Preface



These post-proceedings contain the revised versions of the accepted papers of 
the international workshop “Transactions and Database Dynamics”, which was
the eighth workshop in a series focusing on foundations of models and languages 
for data and objects (FoMLaDO). 

Seven long papers and three short papers were accepted for inclusion in the 
proceedings. The papers address various issues of transactions and database 
dynamics: 

— criteria and protocols for global snapshot isolation in federated transaction 
management, 

— unified theory of concurrency control and replication control, 

— specification of evolving information systems, 

— inheritance mechanisms for deductive object databases with updates,

— specification of active rules for maintaining database consistency, 

— integrity checking in subtransactions, 

— open nested transactions for multi-tier architectures, 

— declarative specification of transactions with static and dynamic integrity 
constraints, 

— logic-based specification of update queries as open nested transactions, and 

— execution guarantees and transactional processes in electronic commerce 
payments. 

In addition to the regular papers, there are papers resulting from two working 
groups. The first working group paper discusses the basis for transactional com- 
putation. In particular, it addresses the specification of transactional software. 
The second working group paper focuses on transactions in electronic commerce 
applications. Among others, Internet transactions, payment protocols, and con- 
currency control and persistence mechanisms are discussed. 

Moreover, there is an invited paper by Jari Veijalainen which discusses tran- 
sactional aspects in mobile electronic commerce. 

Acknowledgments: We are grateful to the members of the program committee 
and others who have reviewed the submitted papers. We are also thankful to 
all authors who have submitted papers to this workshop. Finally, we thank all 
participants of the workshop for the lively discussions. 

December 1999 Gunter Saake 

Kerstin Schwarz 
Can Türker
Abstract. Federated transaction management (also known as multidatabase
transaction management in the literature) is needed to ensure the
consistency of data that is distributed across multiple, largely autono- 
mous, and possibly heterogeneous component databases and accessed by 
both global and local transactions. While the global atomicity of such 
transactions can be enforced by using a standardized commit protocol 
like XA or its CORBA counterpart OTS, global serializability is not self- 
guaranteed as the underlying component systems may use a variety of 
potentially incompatible local concurrency control protocols. The pro- 
blem of how to achieve global serializability, by either constraining the 
component systems or implementing additional global protocols at the 
federation level, has been intensively studied in the literature, but did 
not have much impact on the practical side. A major deficiency of the 
prior work has been that it focused on the idealized correctness criterion 
of serializability and disregarded the subtle but important variations of 
SQL isolation levels supported by most commercial database systems. 
This paper reconsiders the problem of federated transaction manage- 
ment, more specifically its concurrency control issues, with particular 
focus on isolation levels used in practice, especially the popular snapshot 
isolation provided by Oracle. As pointed out in a SIGMOD 1995 paper 
by Berenson et al., a rigorous foundation for reasoning about such concurrency
control features of commercial systems is sorely missing. The
current paper aims to close this gap by developing a formal framework 
that allows us to reason about local and global transaction executions 
where some (or all) transactions are running under snapshot isolation. 
The paper derives criteria and practical protocols for guaranteeing glo- 
bal snapshot isolation at the federation level. It further generalizes the 
well-known ticket method to cope with combinations of isolation levels 
in a federated system. 



1 Introduction 

1.1 Reviving the Problem of Federated Transactions 

With the ever-increasing demand for information integration both within and 
across enterprises, there is renewed interest in providing seamless access to mul- 
tiple, independently developed and largely autonomously operated databases. 
Such a setting is known as a database federation or heterogeneous multidatabase
system. More specifically, the approach of building an additional integration
software layer on top of the underlying component systems is referred to as a fe- 
derated database system [13,21,8,9,10,17]. Among the challenges posed by such 
a system architecture is the problem of enforcing the consistency of the data 
across the boundaries of the individual component systems. Transactions that 
access and modify data in more than one component system are referred to as 
federated or global transactions [4,5]; for example, an electronic commerce ap- 
plication could require a transaction to update data in a merchant’s database as 
well as the databases of a credit card company and a service broker that pointed 
the customer to the merchant and requests a provisioning fee for each sale. Pro- 
viding the usual ACID properties for such federated transactions is inherently 
harder than in a homogeneous, centrally administered distributed database sy- 
stem, one reason being that the underlying component systems of a federation 
may employ different protocols for their local transaction management. A canonical
example is the following schedule of three transactions, $t_1$, $t_2$, and $t_3$,
that read and write data objects $x$, $y$, and $z$ in two databases, $DB_1$ and $DB_2$,
managed by two different database systems (or differently configured instances
of the same database system) with different concurrency control protocols:

$DB_1$: $r_1(x)\, w_1(x)\, C_1\, r_2(x)\, w_2(x)\, C_2$

$DB_2$: $r_3(y)\, r_1(y)\, w_1(y)\, C_1\, r_2(z)\, w_2(z)\, C_2\, r_3(z)\, C_3$

Both local schedules, as seen by each of the two component systems alone, are 
serializable, but the problem is that the resulting serialization orders are incompatible
from a global viewpoint. In $DB_1$, the serialization order requires that
$t_1$ precede $t_2$, whereas in $DB_2$, the only possible serialization order is the one
in which $t_2$ precedes $t_3$ and $t_3$ precedes $t_1$. Thus, the overall execution of the
three transactions is not globally serializable and may potentially result in data 
inconsistencies. 
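
The incompatibility of the two local serialization orders can be checked mechanically. The following Python sketch builds the conflict edges of each local schedule and tests the union of both edge sets for a cycle. The tuple encoding of steps and the helper names `conflict_edges` and `has_cycle` are illustrative assumptions, not part of the formal model developed later in the paper.

```python
from itertools import combinations

def conflict_edges(schedule):
    """Return the conflict edges (ti, tj) of one local schedule.

    A schedule is a list of (txn, op, obj) steps with op in {"r", "w"};
    commits are left out because they do not create conflicts here.
    """
    edges = set()
    # combinations() keeps schedule order, so the first step precedes the second
    for (t1, op1, o1), (t2, op2, o2) in combinations(schedule, 2):
        if t1 != t2 and o1 == o2 and "w" in (op1, op2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a directed cycle in a set of (from, to) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = dict.fromkeys(graph, 0)   # 0 = unvisited, 1 = on the DFS stack, 2 = finished

    def visit(n):
        state[n] = 1
        for m in graph[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)

# The two local schedules of the example (commit steps omitted)
db1 = [(1, "r", "x"), (1, "w", "x"), (2, "r", "x"), (2, "w", "x")]
db2 = [(3, "r", "y"), (1, "r", "y"), (1, "w", "y"),
       (2, "r", "z"), (2, "w", "z"), (3, "r", "z")]

local_1 = conflict_edges(db1)   # t1 before t2
local_2 = conflict_edges(db2)   # t3 before t1, t2 before t3
print(has_cycle(local_1), has_cycle(local_2), has_cycle(local_1 | local_2))
```

Each local graph is acyclic, whereas the union contains the cycle $t_1 \to t_2 \to t_3 \to t_1$, so the script prints `False False True`.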

These kinds of problems with regard to federated transactions have been in- 
tensively studied in the late eighties and early nineties. The proposed solutions 
range from imposing additional constraints on the transaction protocols of the 
underlying component systems to building an additional transaction manager in 
the federated software layer on top of the component systems to reconcile or con- 
trol the underlying local executions. The most notable result in the first category 
probably is that global serializability is self-guaranteed if all component systems 
allow only conflict-serializable schedules where the commit order coincides with 
the serialization order [6,7,19,22], with rigorous schedules being the most impor- 
tant special case [6]. In the second category, the family of ticket methods [11] has 
been among the most promising approaches, complementing knowledge about 
local serialization orders with graph-cycle testing at the federation level. 

All this ample work has led to interesting theoretical insights, but appears 
to have made little impact on the practical side. Consequently, the subject of 
federated transactions has not been pursued further by the research community 
in the last few years. The current paper aims to revive the subject, pushing 
it in a new and practically relevant direction. We believe that the prior work
has not succeeded in the intended technology transfer because it made some 
fundamental assumptions that did not match the typical setting of commercial 
database systems and their real-life applications. Essentially, the prior work made 
too liberal assumptions on the recovery protocols of practically used systems, and 
it made too restrictive assumptions on their local concurrency control protocols: 

— On the recovery side, the point that the underlying component systems are 
largely autonomous and therefore may not necessarily be willing to coope- 
rate for ensuring the atomicity of global transactions has been way overrated. 
Today, standardized distributed commit protocols like XA or its CORBA 
counterpart OTS [14] are supported by virtually all commercially relevant 
database systems, request brokers (TP monitors, ORBs, etc.), and even pack- 
aged business-object software systems. Rather than building complex reco- 
very protocols at the federation level, it is much easier and more manageable 
to rely on those protocols for global atomicity. For long-running workflow- 
style applications, a distributed commit protocol would admittedly incur 
severe performance problems, but for reasonably short federated transac- 
tions (e.g., the kind that arises in electronic commerce applications) it is a 
perfectly viable solution. 

— On the concurrency control side, virtually all prior work assumed that the 
component systems would guarantee at least conflict-serializability for their 
local schedules. In real life, however, various kinds of isolation levels are 
provided by commercial database systems and widely used by performance- 
conscious application developers. These include the isolation levels of the 
SQL standard, such as “read committed” (known as “cursor stability” in 
some commercial systems), as well as vendor-specific options like Oracle’s 
snapshot isolation feature. None of the prior work on federated transac- 
tion management has taken these isolation level options into account, de- 
spite their undebatable practical relevance. Even worse, there is generally 
a wide gap in the theoretical foundation of transaction management as far 
as such concurrency tuning options are concerned. To our knowledge, the 
only exceptions are the remarkable paper by Berenson et al. [2], discussing 
SQL-standard as well as vendor-specific isolation levels from a conceptual 
viewpoint without truly foundational ambitions, however, and the work by 
Atluri et al. [1] extending the classical serializability theory to incorporate 
weaker isolation levels from a formal and rather abstract viewpoint. Both 
papers restrict themselves to a centralized database setting. 



1.2 Contribution and Outline of the Paper 

The current paper aims to narrow the aforementioned gap in the foundation 
of transaction management, while also pursuing practically viable solutions for 
federated transaction management in the presence of isolation levels different 
from standard serializability. We essentially concentrate on concurrency control 
issues, setting recovery aside because standardized distributed commit
protocols like XA and OTS can ensure global atomicity across system boundaries,
as discussed above. We focus on conventional types of transactions, the
key problem being their federated, heterogeneous nature, and do not address 
workflow-style activities that may run for hours or days. Further note that the 
problem of federated concurrency control arises even if the same database sy- 
stem is used for all component databases of the federation, since the system may 
be configured with different isolation level options for different databases (and 
transaction classes). 

Our approach is largely driven by considering the capabilities of commer- 
cial systems. In particular, Oracle provides interesting options that are fairly 
representative for the subtle but important deviations from the pure school of 
serializability theory. Like several other commercial systems, Oracle supports 
transient versioning for enhanced concurrency, and exploits versions, by means 
of a timestamp-based protocol, to offer the following two isolation level options 
[15,16]: 

1. A transaction running under the “read committed” option reads the most 
recent versions of the requested data objects that were committed at the 
time when the read operation is issued. Note that these read accesses to 
committed versions can proceed without any locking. All updates that the 
transaction may invoke are subject to exclusive locking of the affected data 
objects, and such locks are held until the transaction’s commit. 

2. For transactions running under the “snapshot isolation” level, all operati- 
ons read the most recent versions as of the time when the transaction be- 
gan, thus ensuring a consistent view of the data through the transaction. 
A particularly beneficial special case is that all read-only transactions are 
perfectly isolated in the sense of the multiversion serializability theory [3]. 
For read-write transactions, on the other hand, the sketched protocol cannot 
ensure (multiversion) serializability. In addition, Oracle performs the follo- 
wing check upon commit of a transaction: if any data object written by the 
transaction has been written by another, already committed transaction that 
ran concurrently to the considered one (i.e., committed after the considered 
transaction began), then the current transaction is aborted and rolled back. 
Additionally, exclusive locks are used for updates and held until the tran- 
saction’s commit. If a transaction has to wait for a lock and the transaction 
holding that lock commits, the waiting transaction is aborted. This is a spe- 
cial case of the commit-time test, allowing prevention of concurrent writes 
before they happen. 
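
The commit-time test of the second option can be illustrated with a small in-memory model. The sketch below is an assumption-laden simplification: it uses a logical counter as timestamp, buffers writes until commit, and omits the exclusive locks described above. The classes `Store`, `Txn`, and `Aborted` are hypothetical names and do not reflect Oracle's actual implementation.

```python
class Aborted(Exception):
    """Raised when the commit-time write check fails."""

class Store:
    def __init__(self, initial):
        self.clock = 0
        # per object: list of (commit_timestamp, value), oldest first
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def begin(self):
        return Txn(self, self.clock)

class Txn:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def read(self, key):
        if key in self.writes:            # a transaction sees its own writes
            return self.writes[key]
        # most recent version committed no later than the transaction's begin
        return [v for ts, v in self.store.versions[key] if ts <= self.start_ts][-1]

    def write(self, key, value):
        self.writes[key] = value          # buffered until commit

    def commit(self):
        for key in self.writes:
            last_commit_ts = self.store.versions[key][-1][0]
            if last_commit_ts > self.start_ts:
                # a concurrent transaction already committed a write on this object
                raise Aborted(f"concurrent committed write on {key!r}")
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions[key].append((self.store.clock, value))

store = Store({"x": 10, "y": 20})
t1, t2 = store.begin(), store.begin()      # t1 and t2 run concurrently
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 5)
t1.commit()
try:
    t2.commit()
    outcome = "committed"
except Aborted:
    outcome = "aborted"
t3 = store.begin()                          # starts after t1's commit
```

In this run `t2` is aborted because `t1` committed a write on `x` after `t2` began, and the later transaction `t3` reads the value written by `t1`.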

Disallowing concurrent writes aims to provide an additional level of sanity, and 
Oracle even advocates this option under the name “serializability” . Nonetheless, 
the protocol cannot ensure full (multiversion) serializability, with the following 
schedule as a counterexample ($x_i$ denotes the version of $x$ generated by transaction
$t_i$, and $t_0$ is a fictitious initializing transaction):

$r_1(x_0)\, r_1(y_0)\, r_2(x_0)\, r_2(y_0)\, w_1(x_1)\, C_1\, w_2(y_2)\, C_2$
The example may lead to inconsistent data, for example, violating a constraint
such as $x + y < 100$ although both transactions alone would enforce the constraint.
However, given that such anomalies are very infrequent in practice, the
protocol is widely popular in Oracle applications. 
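
A small numeric instance makes the anomaly concrete. In the following Python sketch, the initial values (40 and 40) and the increments (15 each) are assumptions chosen for illustration; each transaction checks the constraint $x + y < 100$ on the data it reads before writing.

```python
def run_snapshot(initial, increments):
    """Both transactions read the same initial snapshot, then write disjoint objects."""
    db = dict(initial)
    snapshots = [dict(initial) for _ in increments]   # all reads precede all writes
    for snap, (obj, delta) in zip(snapshots, increments):
        if snap["x"] + snap["y"] + delta < 100:        # constraint check on the snapshot
            db[obj] = snap[obj] + delta
    return db

def run_serial(initial, increments):
    """Each transaction sees the committed result of its predecessor."""
    db = dict(initial)
    for obj, delta in increments:
        if db["x"] + db["y"] + delta < 100:
            db[obj] = db[obj] + delta
    return db

initial = {"x": 40, "y": 40}
increments = [("x", 15), ("y", 15)]    # t1 updates x, t2 updates y
si_result = run_snapshot(initial, increments)
serial_result = run_serial(initial, increments)
```

Under the snapshot execution both updates pass their local check and the final state has $x + y = 110$, whereas in the serial execution the second transaction sees $x = 55$ and refrains from writing.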

As pointed out by Berenson et al. [2], such non-serializable isolation levels, 
and especially the snapshot isolation option, are extremely useful in practice, 
despite their lack of theoretical underpinnings. Since Oracle would obviously be 
a major player also in a federated database setting, we concentrate in this pa- 
per on understanding the impact of snapshot isolation for federated transaction 
management. In fact, the work presented here is part of a larger effort to build a 
federated database architecture, coined VHDBS, that currently supports Oracle 
and O2 databases [23,24,12,20].

The paper’s “plan of attack” proceeds in two steps, leading to the following 
contributions: 

— We develop a formal model that allows us to reason about isolation levels 
in the context of federated transactions, and we derive results that relate 
local and global isolation levels. Since snapshot isolation exploits transient 
versioning, all our formal considerations are cast in a multiversion schedule 
framework. 

— Based on these theoretical results, we develop new algorithms for federated 
concurrency control, to ensure global snapshot isolation or global (conflict-) 
serializability, whatever the application demands. The latter algorithm is 
based on a generalization of the ticket method to cope with component 
systems that provide snapshot isolation. 

The rest of the paper is organized as follows. In Section 2, we introduce the 
basic model and notations, and we develop the theoretical underpinnings for 
coping with snapshot isolation in concurrent transaction executions. In Section 
3, we derive results on how to relate the isolation levels in local schedules with 
the desired correctness criteria at the federation level, and develop an algorithm 
for ensuring global snapshot isolation under the assumption that all component 
systems guarantee local snapshot isolation. Section 4 develops a practically via- 
ble protocol to guarantee global serializability in a federated system with some 
component systems providing snapshot isolation, extending and generalizing the 
ticket method of [11]. We conclude the paper with an outlook on future work. 
All proofs of theorems are given in the paper’s Appendix. 



2 Basic Model and Notation 

This section introduces the formal apparatus that is necessary for our study 
of snapshot isolation. We briefly introduce the notation and some results of 
the standard theory of (multiversion) serializability in Subsection 2.1, and then 
develop a formal characterization of snapshot isolation in Subsection 2.2. 
2.1 Preliminaries

Definition 1 (transaction). A transaction $t_i$ is a sequence, i.e., a total order
$\ll$, of read and write actions on data objects from one or more databases, along
with a set of begin and commit actions, one for each database accessed by $t_i$, such
that all begin actions precede all read and write actions in the order $\ll$ and all
commit actions follow all other actions w.r.t. $\ll$. The readset and writeset of $t_i$
are the sets of objects for which $t_i$ includes read and write actions, respectively.

We denote the j-th read or write step of $t_i$ by $a_{ij}(x)$ where $x$ is the accessed
object, or $r_{ij}(x)$ (for reads) and $w_{ij}(x)$ (for writes) when the kind of action is
relevant. When we want to indicate the database to which a step refers, we extend
the notation for an action as follows: $a_{ij}^{(k)}(x)$ denotes the j-th step of transaction
$t_i$ accessing object $x$ that resides in database $DB_k$. The begin and commit actions
for database $DB_k$, explicitly denoted whenever they are relevant, are written as
$B_i^{(k)}$ and $C_i^{(k)}$, respectively. □

For all transactions, we further restrict the sequence of read and write accesses
to allow a write on object $x$ only if the transaction also includes a read on
$x$ that precedes the write in the action order $\ll$. Furthermore, we allow at most
one read and at most one write on the same object within a single transaction.
Neither of these two properties seriously restricts the executions that
we can model; in practice, writes are usually preceded by reads, and multiple
reads or writes to the same object can easily be eliminated by using temporary
program variables.

Note that we do not consider partial orders within a transaction for the sake 
of simpler notation, although this relaxation could be incorporated in the model 
quite easily. Similarly, disallowing multiple reads or writes on the same object is 
also only a matter of notation. The only “semantically” relevant restriction of our 
model is that we do not consider transaction aborts and assume all transactions 
to be committed. This paper addresses federated concurrency control; extensions 
to incorporate recovery issues in the formal model would be the subject of future 
work. 

Definition 2 (global and local transactions). A global transaction (GT)
is a transaction that accesses objects from at least two different databases. In
contrast, a local transaction (LT) accesses only a single database. The projection
of a global transaction $t_i$ onto a database $DB_k$ is the set of actions of $t_i$ that refer
to objects from $DB_k$, along with their corresponding order $\ll$. This projection
will be referred to as a global subtransaction (GST) and denoted by $t_i^{(k)}$. □

Note that the dichotomy between GTs and LTs is a purely syntactic one in our 
definition. In practice, an additional key difference is that LTs are not routed 
through an additional federation software layer but rather access a database 
directly through the native database system and nothing else. 

Definition 3 (schedule, multiversion schedule, monoversion schedule).

A schedule of transactions $T = \{t_1, \ldots\}$ is a sequence, i.e., a total order $\ll$, of the
union of the actions of all transactions in $T$ such that the action ordering within
transactions is preserved.

A multiversion schedule of transactions $T = \{t_1, \ldots\}$ is a schedule with an
additional version function that maps each read action $r_i(x)$ in the schedule to
a write action $w_j(x)$ that precedes the read in the order $\ll$. The read action is
then also written as $r_i(x_j)$ with $x_j$ denoting the version created by the write of
$t_j$.

A monoversion schedule of transactions $T = \{t_1, \ldots\}$ is a multiversion schedule
whose version function maps each read action $r_i(x)$ to the most recent write
action $w_j(x)$ that precedes it (i.e., $w_j(x) \ll r_i(x)$ and there is no other write action
on $x$ in between). □

The usual correctness criterion for multiversion concurrency control is that a 
given multiversion schedule should be view-equivalent to a serial monoversion 
schedule, with view-equivalence being defined by the reads-from relation among 
the actions of a schedule [3,18]. 

Definition 4 (MVSR). A multiversion schedule s is multiversion serializable 
(MVSR), if it is view-equivalent to a monoversion serial schedule. □ 

Definition 5 (version order, MVSG). Given a multiversion schedule $s$ and
an object $x$, a version order is a total order of the versions of $x$ in $s$. A version
order for $s$ is the union of the version orders for all objects.

Given a multiversion schedule $s$ and a version order $\ll$ for $s$, the multiversion
serialization graph for $s$ and $\ll$, $MVSG(s, \ll)$, is the directed graph with the
transactions as nodes and the following edges:

— For each operation $r_j(x_i)$ in the schedule there is an edge $t_i \to t_j$ (WR pair).

— For each pair of operations $r_k(x_j)$ and $w_i(x_i)$ where $i$, $j$ and $k$ are distinct,
there is an edge

(i) $t_i \to t_j$, if $x_i \ll x_j$ (WW pair),

(ii) $t_k \to t_i$, if $x_j \ll x_i$ (RW pair). □

The following theorem is the most important characterization of MVSR by means 
of the MVSG, serving as the basis for correctness proofs of practical multiversion 
concurrency control protocols (see, e.g., [3]). 

Theorem 1 (characterization of MVSR [3]). A multiversion schedule $s$ is
in MVSR if and only if there exists a version order $\ll$ such that $MVSG(s, \ll)$
is acyclic. □
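
For very small schedules, Theorem 1 can be applied by brute force: enumerate all version orders and test each resulting MVSG for cycles. The Python sketch below does this under an assumed encoding (reads as triples `(k, obj, j)` for $r_k(obj_j)$, writes as pairs `(i, obj)`, with the initializing transaction $t_0$ listed as a writer). The function names are illustrative, and the enumeration is exponential, so this is a didactic check rather than a practical protocol.

```python
from itertools import permutations, product

def has_cycle(edges):
    """Depth-first search for a directed cycle in a set of (from, to) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = dict.fromkeys(graph, 0)   # 0 = unvisited, 1 = on the DFS stack, 2 = finished

    def visit(n):
        state[n] = 1
        for m in graph[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)

def mvsg_edges(reads, writes, version_order):
    """Edges of MVSG(s, <<); version_order maps obj -> writer ids, earliest version first."""
    pos = {obj: {t: n for n, t in enumerate(order)} for obj, order in version_order.items()}
    edges = set()
    for k, obj, j in reads:
        if j != k:
            edges.add((j, k))                       # WR pair: t_j -> t_k
        for i, wobj in writes:
            if wobj != obj or len({i, j, k}) < 3:   # same object, distinct transactions
                continue
            if pos[obj][i] < pos[obj][j]:
                edges.add((i, j))                   # x_i << x_j: t_i -> t_j
            else:
                edges.add((k, i))                   # x_j << x_i: t_k -> t_i
    return edges

def is_mvsr(reads, writes):
    """True if some version order yields an acyclic MVSG (Theorem 1)."""
    objs = sorted({obj for _, obj in writes})
    writers = [[i for i, o in writes if o == obj] for obj in objs]
    for choice in product(*(permutations(w) for w in writers)):
        if not has_cycle(mvsg_edges(reads, writes, dict(zip(objs, choice)))):
            return True
    return False

# r1(x0) r1(y0) r2(x0) r2(y0) w1(x1) C1 w2(y2) C2, the counterexample of Section 1.2
skew_reads = [(1, "x", 0), (1, "y", 0), (2, "x", 0), (2, "y", 0)]
skew_writes = [(0, "x"), (0, "y"), (1, "x"), (2, "y")]

# r1(x0) r2(y0) w2(y2) C2 r1(y2) w1(y1) C1, an input chosen for illustration
ok_reads = [(1, "x", 0), (2, "y", 0), (1, "y", 2)]
ok_writes = [(0, "x"), (0, "y"), (2, "y"), (1, "y")]
```

For the first schedule every version order produces a cycle, so it is not in MVSR; for the second, the version order $y_0 \ll y_2 \ll y_1$ yields an acyclic graph.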

When we consider only (non-serial) monoversion schedules (for which each write 
is preceded by a read of the same transaction) and wish to reason about their cor- 
rectness, the criterion of MVSR automatically degenerates into the well-known 
notion of conflict-serializability (SR) [18]. This results from the restriction of the 
version function. So with this restriction, MVSR becomes equivalent to the stan- 
dard definition of conflict-serializability (SR) based on read-write, write-write, 
and write-read conflicts (and the corresponding conflict-graph construction). 
2.2 Formalizing the Snapshot Isolation Level (SI)

So far we have introduced the traditional apparatus of multiversion serializability.
We are now ready to discuss relaxations, in the sense of isolation levels,
within this model. In this paper we concentrate on the situation where an entire 
schedule satisfies a certain relaxed isolation level. The general case where only a 
specific subset of transactions tolerates a relaxed isolation level and the “rest of 
the schedule” should still be serializable or MVSR is the subject of future work. 

Definition 6 (snapshot isolation). A multiversion schedule of transactions
$T = \{t_1, \ldots\}$ satisfies the criterion of snapshot isolation (SI) if the following two
conditions hold:

(SI-V) SI version function: The version function maps each read action $r_i(x)$
to the most recent committed write action $w_j(x)$ as of the time of the
begin of $t_i$, or more formally:

$r_i(x)$ is mapped to $w_j(x)$ such that $w_j(x) \ll C_j \ll B_i \ll r_i(x)$ and
there are no other actions $w_h(x)$ and $C_h$ ($h \neq j$) with $w_h(x) \ll B_i$ and
$C_j \ll C_h \ll B_i$.

(SI-W) disjoint writesets: The writesets of two concurrent transactions are
disjoint, or more formally:

if for two transactions $t_i$ and $t_j$, either $B_i \ll B_j \ll C_i$ or $B_j \ll B_i \ll C_j$,
then $t_i$ and $t_j$ must not write a common object $x$. □
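
Both conditions of Definition 6 can be tested directly on a schedule with explicit begin and commit steps. The following Python sketch assumes a tuple encoding of steps and treats the initializing transaction $t_0$ as having written every object and committed before the schedule starts; the function name `check_si` and the placement of the begin steps in the two sample schedules are illustrative assumptions.

```python
def check_si(schedule):
    """Return (si_v, si_w) for a multiversion schedule.

    Steps: ("B", i), ("r", i, obj, j) for r_i(obj_j), ("w", i, obj), ("C", i).
    """
    pos = {(s[0], s[1]): n for n, s in enumerate(schedule) if s[0] in ("B", "C")}
    begin = lambda i: pos[("B", i)]
    commit = lambda i: -1 if i == 0 else pos[("C", i)]   # t0 commits before everything
    writers = {}
    for s in schedule:
        if s[0] == "w":
            writers.setdefault(s[2], set()).add(s[1])

    # (SI-V): every read returns the latest version committed before the reader's begin
    si_v = True
    for s in schedule:
        if s[0] != "r":
            continue
        _, i, obj, j = s
        visible = [h for h in writers.get(obj, set()) | {0} if commit(h) < begin(i)]
        si_v &= j == max(visible, key=commit)

    # (SI-W): concurrent transactions must have disjoint writesets
    txns = [s[1] for s in schedule if s[0] == "B"]
    si_w = True
    for a in txns:
        for b in txns:
            concurrent = begin(a) < begin(b) < commit(a)
            common = {o for o, ws in writers.items() if a in ws and b in ws}
            if a != b and concurrent and common:
                si_w = False
    return si_v, si_w

# Two concurrent transactions writing different objects
s_a = [("B", 1), ("r", 1, "x", 0), ("r", 1, "y", 0),
       ("B", 2), ("r", 2, "x", 0), ("r", 2, "y", 0),
       ("w", 1, "x"), ("w", 2, "y"), ("C", 1), ("C", 2)]

# t1 reads a version of y committed after its own begin and then overwrites y
s_b = [("B", 1), ("r", 1, "x", 0), ("B", 2), ("r", 2, "y", 0),
       ("w", 2, "y"), ("C", 2), ("r", 1, "y", 2), ("w", 1, "y"), ("C", 1)]
```

Here `check_si(s_a)` reports that both conditions hold, while `check_si(s_b)` reports that both are violated.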

SI is weaker than MVSR in the sense that it allows non-serializable schedules,
for example the following:

$r_1(x_0)\, r_1(y_0)\, r_2(x_0)\, r_2(y_0)\, w_1(x_1)\, w_2(y_2)\, C_1\, C_2$

This schedule satisfies both (SI-V) and (SI-W), but it is not equivalent to a serial
monoversion schedule. On the other hand, SI is not a superset of MVSR, because
there are serializable multiversion schedules that are not in SI, for example

$r_1(x_0)\, r_2(y_0)\, w_2(y_2)\, C_2\, r_1(y_2)\, w_1(y_1)\, C_1$

This schedule is not in SI because $r_1(y)$ is mapped to $w_2(y_2)$ rather than $w_0(y_0)$,
and because $t_1$ and $t_2$ are concurrent and write the same object $y$. It is, however,
equivalent to the serial monoversion schedule

$r_2(y)\, w_2(y)\, C_2\, r_1(x)\, r_1(y)\, w_1(y)\, C_1$

The key point here is that this schedule uses a version function different from 
the SI version function. 

We now characterize SI membership of a given multiversion schedule by the 
absence of cycles in a graph. 

Definition 7 (SI version order). The SI version order $\ll_s$ is the order that
is induced by the order of the commit operations of the transactions that wrote
the versions, or more formally:

$x_i \ll_s x_j \iff C_i \ll C_j$ □
Definition 8 (SI-MVSG). The SI multiversion serialization graph SI-MVSG
for a given multiversion schedule $s$ satisfying the property (SI-V) is a directed
graph with the transactions as nodes and the following edges:

— For each operation $r_j(x_i)$ in the schedule there is an edge $t_i \to t_j$, labeled
with “x” (WR pair).

— For each pair of operations $r_k(x_j)$ and $w_i(x_i)$ where $i$, $j$ and $k$ are distinct,
there is an edge

(i) $t_i \to t_j$, if $x_i \ll_s x_j$ (WW pair),

(ii) $t_k \to t_i$, if $x_j \ll_s x_i$ (RW pair),
labeled with “x” in both cases.

Edges in the graph that are labeled with “x” are called x-edges. A cycle in the
graph that consists completely of x-edges is called an x-cycle. □

SI-MVSG is exactly the same as the usual multiversion serialization graph MVSG 
defined above, extended by the edge labels. Similarly to the MVSG theorem cited 
in Subsection 2.1, the following theorem gives a characterization for schedules 
that are snapshot isolated: 

Theorem 2 (Equivalence of SI and cycle-free SI-MVSG). A multiversion
schedule satisfying (SI-V) is

a) in MVSR, if and only if its corresponding SI-MVSG is acyclic, and

b) in SI, if and only if there is no object $x$ such that the SI-MVSG has an x-cycle. □

As an example, consider the following two schedules:

$s_1 := r_1(x_0)\, r_1(y_0)\, r_2(x_0)\, r_2(y_0)\, w_1(x_1)\, w_2(y_2)\, C_1\, C_2$    (1)

$s_2 := r_1(x_0)\, r_1(y_0)\, r_2(x_0)\, r_2(y_0)\, w_1(x_1)\, w_2(x_2)\, C_1\, C_2$    (2)

The schedule $s_1$ is in SI while $s_2$ is not (the transactions are concurrent and
both write $x$). For both schedules, the corresponding SI-MVSG, shown in Figure
1, contains a cycle; thus neither of the two schedules is MVSR. But only the
SI-MVSG for $s_2$ (on the right of Figure 1) contains a cycle that consists solely of
edges labeled by the same object $x$. So $s_2$ is not SI whereas $s_1$ is SI.
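
The two cases of Theorem 2 can be reproduced for these schedules with a short Python sketch. It constructs the labeled edges of the SI-MVSG, then tests the whole graph for a cycle (MVSR) and each single-label subgraph for a cycle (SI). The encoding of reads as `(k, obj, j)` and writes as `(i, obj)`, as well as the function names, are assumptions made for illustration.

```python
def has_cycle(edges):
    """Depth-first search for a directed cycle in a set of (from, to) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = dict.fromkeys(graph, 0)   # 0 = unvisited, 1 = on the DFS stack, 2 = finished

    def visit(n):
        state[n] = 1
        for m in graph[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)

def si_mvsg(reads, writes, commit_order):
    """Labeled edges (from, to, obj) of the SI-MVSG.

    commit_order lists transaction ids in commit order, which induces
    the SI version order of Definition 7.
    """
    rank = {t: n for n, t in enumerate(commit_order)}
    edges = set()
    for k, obj, j in reads:
        edges.add((j, k, obj))                        # WR pair
        for i, wobj in writes:
            if wobj == obj and len({i, j, k}) == 3:   # distinct transactions only
                if rank[i] < rank[j]:
                    edges.add((i, j, obj))            # WW pair
                else:
                    edges.add((k, i, obj))            # RW pair
    return edges

def classify(reads, writes, commit_order):
    """Return (is_mvsr, is_si) for a schedule that satisfies (SI-V)."""
    edges = si_mvsg(reads, writes, commit_order)
    mvsr = not has_cycle({(a, b) for a, b, _ in edges})
    labels = {obj for _, _, obj in edges}
    si = not any(has_cycle({(a, b) for a, b, o in edges if o == obj}) for obj in labels)
    return mvsr, si

reads = [(1, "x", 0), (1, "y", 0), (2, "x", 0), (2, "y", 0)]   # same in s1 and s2
s1_writes = [(0, "x"), (0, "y"), (1, "x"), (2, "y")]
s2_writes = [(0, "x"), (0, "y"), (1, "x"), (2, "x")]
result_s1 = classify(reads, s1_writes, [0, 1, 2])
result_s2 = classify(reads, s2_writes, [0, 1, 2])
```

For $s_1$ the cycle between $t_1$ and $t_2$ mixes an x-edge with a y-edge, so `result_s1` is `(False, True)`; for $s_2$ both edges carry the label x, so `result_s2` is `(False, False)`.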




Fig. 1. SI-MVSG for two schedules
3 Guaranteeing Global Snapshot Isolation

3.1 The Problem of Global Snapshot Isolation 

The goal of this section is to derive necessary and sufficient conditions for sche- 
dules of federated transactions to be globally SI, provided that all underlying 
component systems generate only schedules that are locally SI. In the following, 
we assume that all federated transactions employ an atomic commit protocol 
(ACP) across the affected component systems with a conceptually centralized
coordinator such that 

— the relative order of the local commit operations of different global transactions
is the same in all component systems (i.e., if $t_i$ commits before $t_j$ in
$DB_1$, then $t_i$ must commit before $t_j$ in $DB_2$ etc. as well) and

— the local commit operations are totally ordered among all transactions, i.e., 
no two commit operations happen concurrently. 

With these assumptions, the local commit operations of a transaction can be 
safely replaced by the global commit operation in all local schedules. 

The following example shows that it is not at all trivial to guarantee global 
SI even if all component systems enforce local SI. 

$DB_1$: $r_1^{(1)}(a_0)\, r_2^{(1)}(x_0)\, w_2^{(1)}(x_2)\, C_2\, r_1^{(1)}(x_0)\, C_1$

$DB_2$: $r_2^{(2)}(y_0)\, w_2^{(2)}(y_2)\, C_2\, r_1^{(2)}(y_2)\, w_1^{(2)}(y_1)\, C_1$

In database $DB_1$, the subtransactions $t_1^{(1)}$ and $t_2^{(1)}$ of the two global transactions
$t_1$ and $t_2$ run concurrently, so they both read from a prior transaction
(here this is the fictitious, initializing transaction $t_0$). Only $t_2^{(1)}$ writes an object.
This subschedule therefore is SI. The other subschedule, in database $DB_2$, is SI,
too: the two subtransactions $t_1^{(2)}$ and $t_2^{(2)}$ run serially and they both read the
correct values according to the SI version function. However, combining the two
subschedules into a global schedule (with database superscripts omitted) yields:

$r_1(a_0)\, r_2(x_0)\, w_2(x_2)\, r_2(y_0)\, w_2(y_2)\, C_2\, r_1(x_0)\, r_1(y_2)\, w_1(y_1)\, C_1$

This schedule is not snapshot isolated: $t_1$ reads $y$ from $t_2$ rather than $t_0$ (so
(SI-V) does not hold), and both transactions run concurrently and both write
y (so (SI-W) does not hold either). The problem lies in the fact that the local 
scheduler in $DB_2$ does not know when $t_1$ started at the global federation level,
so it fails to assign the globally correct versions, restricting itself to local correc- 
tness. Analogous arguments hold for the consideration of global writesets versus 
local writesets and global concurrency versus local concurrency. The following 
two subsections present two algorithms that overcome these problems and guarantee
that the global schedule is SI. The first, pessimistic approach, discussed
in Subsection 3.2, achieves its objective by synchronizing the local begin operations
of transactions (in addition to the implicit synchronization of the local
commits that results from the ACP). The second, optimistic approach, discussed
in Subsection 3.3, is based on testing the potential violation of global SI.
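
The gap between local and global SI in the example above can be reproduced with a short Python sketch. It assumes that a transaction begins at its first step in the (sub)schedule under consideration, that $t_0$ precedes everything, and that the projection onto a database keeps the steps on that database's objects plus the commits of the transactions involved. The encoding and the names `is_si` and `project` are illustrative assumptions.

```python
def is_si(schedule):
    """Check (SI-V) and (SI-W); a transaction begins at its first step.

    Steps: ("r", i, obj, j) for r_i(obj_j), ("w", i, obj), ("C", i).
    """
    begin, commit, writers = {}, {0: -1}, {}
    for n, s in enumerate(schedule):
        begin.setdefault(s[1], n)
        if s[0] == "C":
            commit[s[1]] = n
        elif s[0] == "w":
            writers.setdefault(s[2], set()).add(s[1])
    for s in schedule:
        if s[0] == "r":
            _, i, obj, j = s
            visible = [h for h in writers.get(obj, set()) | {0} if commit[h] < begin[i]]
            if j != max(visible, key=commit.get):
                return False                      # (SI-V) violated
    for ws in writers.values():
        for a in ws:
            for b in ws:
                if a != b and begin[a] < begin[b] < commit[a]:
                    return False                  # (SI-W) violated
    return True

def project(schedule, objects):
    """Keep the steps on the given objects plus the commits of the transactions involved."""
    active = {s[1] for s in schedule if s[0] != "C" and s[2] in objects}
    return [s for s in schedule
            if (s[0] == "C" and s[1] in active) or (s[0] != "C" and s[2] in objects)]

# r1(a0) r2(x0) w2(x2) r2(y0) w2(y2) C2 r1(x0) r1(y2) w1(y1) C1
global_s = [("r", 1, "a", 0), ("r", 2, "x", 0), ("w", 2, "x"), ("r", 2, "y", 0),
            ("w", 2, "y"), ("C", 2), ("r", 1, "x", 0), ("r", 1, "y", 2),
            ("w", 1, "y"), ("C", 1)]
db1 = project(global_s, {"a", "x"})   # objects assumed to reside in DB1
db2 = project(global_s, {"y"})        # object assumed to reside in DB2
```

Both projections pass the test, whereas the global schedule fails it because $t_1$ begins globally before $t_2$ commits but reads $y_2$.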
3.2 Pessimistic Approach: Synchronizing Subtransaction Begins

An idea towards a globally correct execution is to start all local subtransactions
$t_i^{(k)}$ upon the first operation of the global transaction $t_i$. The intuition behind
this is that we want all subtransactions of a global transaction to have the same 
start time as the global transaction in the global schedule. If we can achieve 
this, any violation of the SI property of the global schedule could be mapped to 
a corresponding violation in one of the local schedules, and the latter would be 
prevented by the fact that all local schedulers generate SI schedules. 

The obvious way to “simultaneously” start the local subtransactions is by
issuing an explicit local begin in each component system $DB_k$ on which the
transaction will possibly issue read or write operations. Unfortunately, this alone
does not entirely solve the problem. It may occur that a global transaction $t_1$
commits while another transaction $t_2$ is in the process of submitting its local
begin operations. If in some database the begin operation is executed before the
commit of $t_1$ and this order is reversed in another database, the global SI property
still cannot be guaranteed. The following part of a schedule illustrates this
problem:

$DB_1$: [...] $C_1 \; r_2^{(1)}(x_0) \; C_2$

$DB_2$: $r_1^{(2)}(y_0) \; w_1^{(2)}(y_1) \; C_1 \; B_2^{(2)} \; r_2^{(2)}(y_1) \; w_2^{(2)}(y_2) \; C_2$

This execution is not globally SI. However, both local schedules are SI, so merely 
adding the local begin operations did not fix the problem. 

Obviously, the above problem arises because a global commit is executed 
in between several local begin operations of the same federated transaction. 
So we need to delay the commit request of a global transaction when another 
transaction is executing its begin operations, until all local begin operations have 
returned. Likewise, we have to delay the begin operations if another transaction 
is already in the process of committing, until the commit is finished. 

With this kind of begin synchronization and the implicit synchronization by
the ACP, we can now assume that all subtransactions of a global transaction
$t_i$ start at the same time $B_i$ and end at the same time $C_i$ in all
underlying component systems. Therefore, the version functions of the various
component systems are “synchronized” as well. This consideration leads us to a
criterion for globally correct execution, as expressed by the following theorem.

Theorem 3. Let s be a schedule in a federated database system whose component
systems guarantee local SI for the local schedules (and with an ACP based on a
conceptually centralized coordinator). If each federated transaction issues
local begin operations on each potentially accessed component system and the
federation level prevents global commit operations from being concurrently
executed with a transaction’s set of local begin operations, then s is snapshot
isolated. □
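
The synchronization demanded by Theorem 3 can be illustrated with a small
Python sketch. It is not taken from the paper: the class name
`BeginCommitGate`, its methods, and the event log are hypothetical, and a single
mutex is assumed for simplicity. That choice is more conservative than the
theorem requires, because it also serializes the begin sets of different
transactions against each other.

```python
import threading


class BeginCommitGate:
    """Hypothetical federation-level gate (an assumption, not the paper's code).

    A transaction's whole set of local begins and every global commit run
    under the same mutex, so no commit can fall between two local begins
    of one federated transaction.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.log = []  # stands in for the calls sent to the component systems

    def begin_all(self, txn, component_systems):
        # Issue the local begin on every potentially accessed component system.
        with self._lock:
            for db in component_systems:
                self.log.append(("begin", txn, db))

    def commit(self, txn):
        # A global commit waits until no begin set is in progress.
        with self._lock:
            self.log.append(("commit", txn))
```

In a real federation layer the `log.append` calls would be replaced by the
actual begin and commit requests; the point of the sketch is only that both
kinds of requests pass through the same critical section.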

This approach is relatively easy to implement, but it has potential performance
problems: At the start time of a federated transaction, it is not necessarily
known which component systems the transaction may access during its execution
(nor can this information be easily inferred from the transaction program). To
be on the safe side, local begin operations would have to be executed in a
conservatively chosen set of component systems if not even all component systems
of the federation. Although each of the local begin operations is fairly
inexpensive, the total overhead may incur a significant penalty in terms of the
transaction throughput. Even worse, since global commit operations may have to
be delayed by begin operations, there is also a potentially severe performance
drawback as far as transaction response time is concerned.



3.3 Optimistic Approach: Testing for Global SI Violations 

Looking again at the example in the previous Subsection 3.2, we observe a
specific situation: the two transactions are concurrent in one database ($DB_1$)
while having a reads-from relationship in the serial execution in the other
database ($DB_2$). This type of situation is in fact the only case when global
SI can be violated despite all component systems enforcing local SI. Note that
the concurrent execution in one database implies that the writesets of the two
transactions must be disjoint because of the local SI property. However, this
concurrent execution may result in a version function that is incompatible with
the version function of the second component system. Further note that in the
component system where the subtransactions are concurrent, no reads-from
relationship is possible between the two transactions. So it seems that all we
need to do is to avoid that two transactions are a) concurrent in one database
and b) have a (serial) reads-from relationship in another database.
Unfortunately, the following example shows that this consideration still does
not capture all relevant cases, as it disregards schedules where one transaction
accesses only a proper subset of the databases accessed by the other transaction
(e.g., only one out of $DB_1$ and $DB_2$).

$DB_1$: $r_1^{(1)}(a_0) \; C_1$

$DB_2$: $r_2^{(2)}(x_0) \; w_2^{(2)}(x_2) \; C_2 \; r_1^{(2)}(x_2) \; w_1^{(2)}(x_1) \; C_1$

This global schedule is not SI, because both transactions are concurrent and
write the same object x. However, our above consideration does not capture
this unacceptable case, because all operations of $t_1$ are strictly after those
of $t_2$ in $DB_2$, and $t_2$ does not have any operations in $DB_1$. In the
approach of the previous Subsection 3.2 with synchronized begin operations, such
an execution would be impossible because the local begin operation of $t_1$ in
$DB_2$ would make $t_1$ locally start before $t_2$ so that $t_1$ would read
$x_0$ rather than $x_2$. However, the synchronized begin operations would have
performance drawbacks that we would like to avoid in the current approach. Here,
to fix the above problem, we first extend our model of a schedule by introducing
additional pseudo-operations $B_i^{(k')}$ and $C_i^{(k')}$ for each transaction
$t_i$ and each database $DB_{k'}$ that is not accessed by $t_i$. The ordering of
the artificial operations is such that it is compatible with all real commit
operations in the other databases and arbitrary with regard to read and write
operations, and the artificial $B_i^{(k')}$ operations are placed immediately
before the corresponding local commit. Unlike the local begin operations of the
previous Subsection 3.2, the newly introduced operations here are indeed only
placeholders for the purpose of correctness reasoning about federated schedules;
they do not have to be issued in the real system. The net effect of this
syntactical extension is that we can now easily tell, in our model, for each
pair of transactions $t_i$ and $t_j$ and each $DB_k$ if $t_i^{(k)}$ fully
precedes $t_j^{(k)}$, denoted $t_i^{(k)} < t_j^{(k)}$ (or more precisely:
$C_i^{(k)} < B_j^{(k)}$), or if $t_j^{(k)}$ fully precedes $t_i^{(k)}$, or if
$t_i^{(k)}$ and $t_j^{(k)}$ are concurrent, the latter being denoted as
$t_i^{(k)} \parallel t_j^{(k)}$.



Theorem 4 (characterization of the global SI version function). Let s be a
schedule, extended in the above way, that is executed in a federated database
system (using an ACP) whose component systems guarantee local SI. The version
function of s, that is, the union of the version functions of the underlying
local schedules, satisfies (SI-V) if and only if there are no two transactions
$t_i$ and $t_j$ and no two databases $DB_k$ and $DB_l$ such that
$t_i^{(k)} < t_j^{(k)}$, $t_j^{(k)}$ reads from $t_i^{(k)}$, and
$t_i^{(l)} \parallel t_j^{(l)}$. □



From this necessary and sufficient condition for a global schedule to satisfy (SI- 
V), we can further derive the following theorem that tells us when a global 
schedule is SI. 



Theorem 5 (sufficient and necessary condition for global SI). Let s be an
extended schedule in a federated database system (using an ACP) whose component
systems guarantee local SI. Then s is SI if and only if there are no two
transactions $t_i$ and $t_j$ and no two databases $DB_k$ and $DB_l$ such that
$t_i^{(k)} < t_j^{(k)}$, $t_j^{(k)}$ reads from $t_i^{(k)}$, and
$t_i^{(l)} \parallel t_j^{(l)}$. □

This theorem forms the basis of an algorithm to enforce global SI, under the
assumption that all component systems guarantee local SI. The idea is to monitor
the three conditions of the theorem and take appropriate actions at the
federated level when all three hold simultaneously for a pair of transactions.
Monitoring the property whether two transactions execute serially or
concurrently in one component system is straightforward. Monitoring the
reads-from relationship, however, is very difficult if not infeasible in a
practical system setting. The problem is that we cannot tell by merely observing
the operation requests and replies at a component system’s interface which
version, identified by the creating transaction’s number, was read by a read
operation. So we need to build an actual algorithm for global SI on a coarser
version of the last theorem where the reads-from condition is omitted and thus
the set of allowed schedules is restricted even further.



Corollary 1 (sufficient condition for global SI). Let s be an extended schedule
in a federated database system (using an ACP) whose component systems guarantee
local SI. Then s is SI if there are no two transactions $t_i$ and $t_j$ and no
two databases $DB_k$ and $DB_l$ such that $t_i^{(k)} < t_j^{(k)}$ and
$t_i^{(l)} \parallel t_j^{(l)}$. □
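
As an illustration of the corollary, the following Python sketch checks a
finished, extended schedule offline. It is only a sketch under assumptions that
are not part of the paper: the function name `corollary_violations` is
hypothetical, and each subtransaction (including the pseudo-operations for
databases that were not accessed) is assumed to be given as a pair of begin and
commit timestamps.

```python
from itertools import combinations


def corollary_violations(intervals):
    """Return the pairs that are serial in one database and concurrent in another.

    intervals: {database: {transaction: (begin, commit)}}, where pseudo
    operations are already filled in for databases that were not accessed.
    """
    relation = {}      # (txn_a, txn_b) -> "serial" or "concurrent"
    violations = set()
    for per_db in intervals.values():
        for a, b in combinations(sorted(per_db), 2):
            (begin_a, commit_a), (begin_b, commit_b) = per_db[a], per_db[b]
            # Serial: one subtransaction commits before the other begins.
            if commit_a < begin_b or commit_b < begin_a:
                rel = "serial"
            else:
                rel = "concurrent"
            # A pair must have the same relation in every database.
            if relation.setdefault((a, b), rel) != rel:
                violations.add((a, b))
    return violations
```

Applied to the second example of this subsection, with a pseudo begin of the
transaction that does not access the first database placed just before its
commit, the pair is concurrent in one database and serial in the other, so the
function reports it.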

So for global SI to hold, the situation to be disallowed is that the
subtransactions of two global transactions are running concurrently in one
database and serially in another one, regardless of whether one transaction
reads from the other transaction or not. This leads to the following algorithm
for ensuring the SI property of global schedules.

In the federation layer, we assign a unique timestamp $B_i^{(k)}$ to the
subtransaction of transaction $t_i$ in database $DB_k$ before it issues its
first operation there. This does not require any additional operation on the
component system, only some bookkeeping at the federation level. When
transaction $t_i$ requests its global commit, we assign a timestamp $C_i$ to it.
In addition, we keep information about the relative execution order of
subtransactions in an array R[i,j], where

$$
R[i,j] =
\begin{cases}
\text{serial}, & \text{if there is a database } DB_k \text{ where } C_i < B_j^{(k)} \\
\text{concurrent}, & \text{if there is a database } DB_k \text{ where } B_i^{(k)} < B_j^{(k)} < C_i \text{ or } B_j^{(k)} < B_i^{(k)} < C_j \\
\text{undefined}, & \text{if at least one of the transactions has not yet submitted any operation}
\end{cases}
$$



Initially, there are no transactions in the system, so R is empty. Once a
transaction $t_i$ enters the system, all the $B_i^{(k)}$ and $C_i$ values are
set to $\infty$, and R[i,j] = R[j,i] = undefined for all active transactions
$t_j$. Upon the first operation of transaction $t_i$ in database $DB_k$, the
timestamp $B_i^{(k)}$ is assigned and R is updated as follows:



for all transactions t_j in the system do
    if (C_j < B_i^(k)) then              // serial execution
        if (R[i,j] = concurrent) then abort t_i
        else R[i,j] := R[j,i] := serial
    else if (B_j^(k) < B_i^(k)) then     // concurrent execution
        if (R[i,j] = serial) then abort t_i
        else R[i,j] := R[j,i] := concurrent

When transaction $t_i$ attempts to commit, our bookkeeping needs to add
“pseudo operations” for each database $DB_k$ that $t_i$ has not accessed at all.
To do so, we set $B_i^{(k)} = C_i$ and update R as if $t_i$ just submitted an
operation to this database. If adding these pseudo operations does not force
$t_i$ to abort, we allow it to commit.
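
The bookkeeping described above can be sketched in Python as follows. This is
an illustration under assumptions, not the authors' implementation: the names
`GlobalSIMonitor`, `AbortTransaction`, `enter`, `first_operation`, and
`try_commit` are hypothetical, timestamps come from a simple counter, the array
R is stored as a dictionary in which a missing entry means "undefined", and
cleanup after an abort or after old transactions leave the system is omitted.

```python
import itertools
from math import inf

SERIAL, CONCURRENT = "serial", "concurrent"


class AbortTransaction(Exception):
    """Raised for the transaction that the federation layer must abort."""


class GlobalSIMonitor:
    def __init__(self, databases):
        self.databases = list(databases)
        self.clock = itertools.count(1)   # source of unique timestamps
        self.begin = {}                   # (txn, db) -> B timestamp
        self.commit = {}                  # txn -> C timestamp (inf while active)
        self.R = {}                       # frozenset({i, j}) -> relation

    def enter(self, txn):
        self.commit[txn] = inf

    def _stamp(self, txn, db, ts):
        # Assign the begin timestamp and update R against all other transactions.
        self.begin[(txn, db)] = ts
        for other, other_commit in self.commit.items():
            if other == txn:
                continue
            if other_commit < ts:
                relation = SERIAL        # other committed before this begin
            elif self.begin.get((other, db), inf) < ts:
                relation = CONCURRENT    # other is still running in this database
            else:
                continue                 # nothing known about the pair here
            if self.R.setdefault(frozenset((txn, other)), relation) != relation:
                raise AbortTransaction(txn)

    def first_operation(self, txn, db):
        if (txn, db) not in self.begin:
            self._stamp(txn, db, next(self.clock))

    def try_commit(self, txn):
        ts = next(self.clock)
        # Pseudo operations for every database the transaction never accessed.
        for db in self.databases:
            if (txn, db) not in self.begin:
                self._stamp(txn, db, ts)
        self.commit[txn] = ts
```

Replaying the second example of this subsection with this sketch, the pseudo
begin added at the commit of the first committing transaction records the pair
as concurrent, and the later access of the other transaction to the second
database finds the pair serial there, so that transaction is aborted.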

Obviously this algorithm can cause unnecessary aborts, since it does not 
take into account the actual reads-from relationship. However, as noted before, 
this is an inescapable consequence of the fact that the federation layer cannot 
easily observe the details of the local executions in a real-life system setting. 
Despite this drawback of possibly unnecessary aborts, the algorithm appears to 
be significantly more efficient than the begin-synchronization approach of the 
previous Subsection 3.2. In particular, the presented algorithm incurs overhead 
only for those component systems that are actually accessed, and does not delay 
any begin or commit requests. 

3.4 Remarks On Global Snapshot Isolation with Non-SI Component
Systems

In the previous two subsections we have shown how to guarantee global SI when 
all component systems guarantee local SI. It would, of course, also be a desirable 
option to enforce global SI even if some component systems may support a 
correctness criterion other than local SI. A particularly interesting combination 
would be one database supporting local SI and another database supporting 
standard conflict-serializability. Assume that a component system guarantees 
conflict-serializability in combination with the avoidance of cascading aborts 
(ACA). Then the first read operation of a transaction establishes a reads-from
relationship that constrains the feasibility of subsequent reads if we want to 
guarantee that the resulting schedule is also locally SI (i.e., actually a member 
of the schedule class SR ∩ ACA ∩ SI). The transaction must not read any data
that are committed later than at the time of that first read. Brute-force methods 
to enforce this constraint are conceivable (e.g., by aborting all transactions that 
initiate a commit between our transaction’s first read and its commit), but it is 
very likely that they restrict the possible concurrency in an undue manner. This 
problem alone has discouraged us from proceeding further along these lines. In 
addition, one would have to ensure that the version functions of the different 
local schedules are compatible and that writesets of concurrent transactions are 
disjoint, both at the level of global transactions. Obviously, this setting calls for 
future research. 



4 Guaranteeing Global Serializability 

Although SI is a popular option in practice, there are certainly many
mission-critical applications that still demand the more rigid correctness
criterion of global (conflict-)serializability (SR). In this section, we study
the problem of ensuring global SR in the presence of component systems that
merely provide SI. We develop an algorithm that extends the well-known ticket
method [6,11] so that it supports local SI schedulers, while guaranteeing global
SR. In Subsection 4.1, we briefly review the standard ticket method and discuss
its benefits and potential shortcomings. In Subsection 4.2, we show that, with a
minor extension, the ticket method yields correct executions even when applied
to local SI schedulers, but point out that this approach has certain severe
drawbacks. In Subsection 4.3, we finally present a generalization of the ticket
method that increases performance for applications with a large fraction of
read-only (sub-)transactions.



4.1 Benefits and Potential Shortcomings of the Ticket Method 

A ticket is a dedicated data object of type “counter” (or some other numerical
type) in a component system, used solely for the purpose of global concurrency
control. Each global subtransaction must read the current value of the ticket
and write back an increased value at some point during its execution (the
so-called take-a-ticket operation). The ordering of the ticket values read by
two global subtransactions reflects the local serialization order of the two
subtransactions. A ticket graph is maintained at the federation level to detect
incompatible local serialization orders. This graph has the global transactions
as nodes, and there is an edge from transaction $t_i$ to transaction $t_j$ if
the ticket of $t_i$ is smaller than that of $t_j$ in some component database. It
has been shown in [11] that the global schedule is serializable if and only if
the ticket graph does not contain a cycle, provided that all component systems
generate only locally SR schedules.
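
The ticket-graph test can be sketched in a few lines of Python. The sketch is
illustrative only: the function names `build_ticket_graph` and `has_cycle` and
the dictionary layout of the ticket values are assumptions, and the cycle search
is an ordinary depth-first search rather than anything prescribed by the ticket
method.

```python
def build_ticket_graph(tickets):
    """tickets: {database: {transaction: ticket value}} -> adjacency sets.

    There is an edge a -> b if a's ticket is smaller than b's ticket in
    at least one component database.
    """
    edges = {}
    for per_db in tickets.values():
        for a, ticket_a in per_db.items():
            edges.setdefault(a, set())
            for b, ticket_b in per_db.items():
                if ticket_a < ticket_b:
                    edges[a].add(b)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the ticket graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}

    def visit(node):
        color[node] = GREY
        for succ in edges[node]:
            # A grey successor is on the current search path: cycle found.
            if color[succ] == GREY or (color[succ] == WHITE and visit(succ)):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in edges)
```

Two transactions whose tickets are ordered one way in one database and the
other way in a second database produce edges in both directions, which the
search reports as a cycle.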

This method is easy and efficient to implement; one merely needs to add the 
take-a-ticket operation to each subtransaction and to manage the global ticket 
graph, i.e., searching for cycles when a global transaction attempts to commit 
(or at some earlier point). Upon detecting a cycle, one (or more) of the involved 
transactions must be aborted. For component systems with the property of
allowing only rigorous local schedules, it is not even necessary to take a ticket at
all; rather the time of the commit operation can be used as an implicit ticket. So 
the ticket method has the particularly nice property of incurring overhead only 
for non-rigorous component systems, even with transactions that access both 
rigorous and non-rigorous component systems. This makes the ticket method a 
very elegant and versatile algorithm for federated concurrency control. 

The ticket method has two potential problems, however. First, the ticket 
object may be a potential bottleneck in a component database, since all global 
subtransactions have to read and write the ticket. Second and much more
severely, having to write the ticket turns read-only transactions into
read-write transactions, thus preventing the possibility of specific
optimizations for read-only (sub-)transactions (e.g., Oracle’s “Set Transaction
Read Only” option).

4.2 Applying the Ticket Method to SI Component Systems 

The standard ticket method requires that local schedulers generate
(conflict-)serializable schedules. Thus, it is not clear if the method can
incorporate component systems that provide non-serializable SI schedules.
Consider the effects of the additional take-a-ticket operation on the execution
of global subtransactions on an SI component system. For each pair of concurrent
global subtransactions on such a database, both write at least one common
object, the ticket, so that one of them must inevitably be aborted to ensure
local SI. The resulting local schedule is therefore trivially serializable,
because it is in fact already serial, and the ordering of the ticket values of
two global subtransactions reflects their local serial(ization) order. Note that
this does not make the entire global transactions serial; concurrency is still
feasible in other (non-SI, but SR) databases. Nevertheless, sequentializing all
global subtransactions in an SI component system is a dramatic loss of
performance and would usually be considered as an overly high price for global
consistency.

What about local transactions on an SI component system, i.e., transactions
that solely access this database and are not routed through the federation
level? Those transactions do not have to take a ticket in the original ticket
method (as any additional overhead for them could possibly be considered as a
breach of the local database autonomy). Nevertheless, global serializability is
still ensured by means of the ticket-graph testing, as cases where local
transactions cause the serialization order of global transactions to be reversed
would lead to local cycles and are thus detectable. Unfortunately, with SI
component systems, tickets for global subtransactions alone are insufficient to
detect those critical situations caused by local transactions. An example
schedule for an SI database with ticket object T, global subtransactions $t_1$
and $t_2$, and a local transaction $t_3$ is the following:

$r_3(y_0) \; r_1(y_0) \; w_1(y_1) \; r_1(T_0) \; w_1(T_1) \; C_1 \; r_2(x_0) \; r_2(T_1) \; w_2(T_2) \; C_2 \; r_3(x_0) \; w_3(x_3) \; C_3$

The schedule including the ticket operations is SI and even SR with the ticket
operations removed, but the ticket ordering of $t_1$ and $t_2$ contradicts the
serialization order of the ticket-less schedule.

A possible and in fact the only solution for this problem is to add take-a-ticket 
operations also to all local transactions. This can be done without modifying the 
application programs themselves, for example, by changing the stub code of the 
commit. However, there is still a major problem unsolved, as the forced serial 
execution on an SI component system, discussed above, would now apply to both 
global subtransactions and local transactions. So the resulting performance loss 
would involve the local transactions as well, and this is surely unacceptable 
in almost all cases. In the next subsection we will present a generalization of 
the ticket method to cope with SI databases while avoiding such performance 
problems for the most important case of read-only (sub-) transactions. There 
is, however, no panacea that can cope with the most general case without any 
drawbacks. 

4.3 Generalizing Tickets for Read-Only Subtransactions on SI 
Component Systems 

In this subsection, we present a generalization of the ticket method that allows
read-only transactions to run as concurrently as the component systems allow
them (i.e., without additional restrictions imposed by the federated concurrency
control). So for application environments that are dominated by read-only
transactions but exhibit infrequent read-write transactions as well, our
approach reconciles the consistency quality of global serializability with a
sustained high performance.

Each global subtransaction and local transaction has to be marked “read-only”
or “read-write” at its beginning; an unmarked transaction is assumed to be
read-write by default. A transaction that was previously declared as read-only
can be re-labeled as read-write at any point during its execution (with certain
consequences in terms of its performance, however). A global transaction is
read-only if all its subtransactions are read-only; otherwise it is a global
read-write transaction. We will first discuss how to deal with read-write
transactions in a rather crude, unoptimized way, and then show how read-only
transactions can be handled in a much more efficient manner.










Read-Write Transactions. Our extended ticket method requires that all global
read-write transactions are executed serially. That is, the federated
transaction manager has to ensure that at most one global read-write transaction
is active at a time. There is no such restriction, however, for global read-only
transactions or for local (read-write) transactions. Each read-write
subtransaction of a global read-write transaction takes a ticket as in the
standard ticket method. Read-only subtransactions of read-write transactions
only need to read the corresponding ticket object, as further discussed below.
Note that tickets are still necessary for global read-write transactions to
correctly handle the potential interference with local transactions.

Although the sequentialization of global read-write transactions appears very 
restrictive, it is no more restrictive than in the original ticket method if SI 
component systems are part of the federation. On the other hand, our generalized 
method is much less restrictive with regard to read-only transactions and local 
transactions. 



Read-Only Transactions. The problem with read-only transactions is that the
usual updating of ticket objects would turn them into read-write transactions,
with the obvious adverse implications. A careful analysis of the possible cases,
however, shows that it is sufficient if read-only (sub-)transactions merely read
the ticket object.

A subtransaction’s ticket value shows its position in the locally equivalent
serial schedule. When a component system generates SI schedules, the ticket
value that a transaction $t_i$ reads from the corresponding database $DB_k$
depends only on the position of its first local operation $B_i^{(k)}$. The
transaction reads from other transactions that were committed before
$B_i^{(k)}$, so its position in the equivalent serial schedule must be behind
them. Its ticket value must therefore be greater than that of all transactions
that committed earlier. On the other hand, the transaction $t_i$ does not see
updates made by transactions that commit after $B_i^{(k)}$; so it must precede
them in the equivalent serial schedule, hence its ticket value must be smaller
than the tickets of those transactions. Thus, a feasible solution is to assign
to a read-only subtransaction a ticket value that is strictly in between the
value that was actually read from the ticket object and the next higher possible
value that a read-write subtransaction may write into the ticket object. This
approach can be implemented very easily. For example, if ticket objects are of
type integer, we can restrict the actual ticket-object values to even integers
with read-write subtransactions always incrementing the ticket object by two,
and for read-only subtransactions we can use the odd integer that immediately
follows the actually read ticket value for the purpose of building the global
ticket graph. The fact that this may result in multiple read-only
subtransactions with the same ticket value is acceptable in our protocol.
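
The even/odd scheme can be sketched as follows. The class `TicketObject` and
its method names are hypothetical, and the sketch ignores snapshots and
concurrency entirely: it assumes that each call stands for the ticket access a
subtransaction performs against the database state it sees.

```python
class TicketObject:
    """Ticket counter of one component system; it only ever holds even integers."""

    def __init__(self, value=0):
        if value % 2:
            raise ValueError("the ticket object must hold an even integer")
        self.value = value

    def take_ticket(self):
        """Read-write subtransaction: read the ticket and write back value + 2."""
        self.value += 2
        return self.value

    def read_only_ticket(self):
        """Read-only subtransaction: read only and use the next odd integer."""
        return self.value + 1
```

With a ticket object that currently holds 6, a read-only subtransaction obtains
7 without writing anything, the next read-write subtransaction writes 8, and a
later read-only subtransaction obtains 9.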

Figure 2 shows an example, with the ticket values that the read-write
transactions write in the databases denoted by the numbers below the transaction
boxes. The smaller white and black boxes denote the first operations of two
read-only transactions $t_w$ and $t_b$ that span all three databases. Looking at
the white transaction $t_w$ in the first database, we see that its ticket value
must be greater than 6 (which is the value that $t_1$ wrote), but smaller than 8
(which is what $t_2$ wrote), so a possible ticket value for $t_w$ is 7. The
tickets for the other subtransactions are assigned analogously.

Using these ticket values, we can now test if the execution of the two read-only
transactions in Figure 2 leads to a globally serializable execution, i.e., a
cycle-free global ticket graph. First consider the black read-only transaction
$t_b$. In the databases $DB_1$ and $DB_3$, it starts its execution after the
commit of $t_2$ (and $t_1$), so its ticket is greater than that of $t_2$ and
$t_1$ (for example, the ticket of $t_2$ in $DB_1$ is 8, while that of $t_b$
would be 9). In $DB_2$, $t_2$ had no operations, and $t_b$ started after $t_1$
committed. As a result, $t_b$ is always executed after $t_1$ and $t_2$
committed; the corresponding ticket value graph has no cycle. In an equivalent
global serial schedule, $t_b$ must be executed after $t_1$ and $t_2$.



Fig. 2. Assigning a ticket value to read-only transactions



As for the white transaction $t_w$, it executes its first operations in $DB_2$
and $DB_3$ before $t_3$ committed, so its ticket value is smaller than that of
$t_3$ in both databases. On the other hand, it makes its first operation in
$DB_1$ before $t_2$ committed, but in $DB_3$ after $t_2$ committed. It is
therefore possible that $t_w$ reads values written by $t_2$ in $DB_3$ but not in
$DB_1$, so the state of the global database seen by $t_w$ could be inconsistent.
This is captured by the ordering of the tickets of these transactions. The
ticket of $t_w$ in $DB_1$ is smaller than that of $t_2$ (7 vs. 8), while it is
greater in $DB_3$ (11 vs. 10). The corresponding ticket graph therefore has a
cycle between these two transactions which is detected when $t_w$ attempts to
commit.

Local transactions are incorporated in this protocol as before: read-write
transactions need to take tickets as usual, whereas read-only transactions
merely need to read the ticket object. These considerations hold for SI
component systems. Our method can, however, easily be combined with standard
tickets for other types of component systems, if such databases participate in
the federation. For example, if some database generates schedules where the
serialization order of the transactions reflects the order of their commit
operations, the “implicit ticket” method [11] can be applied for this database
(i.e., no explicit tickets need to be taken). In databases for which
conventional (conflict-)serializability is guaranteed, all global
subtransactions submit the usual take-a-ticket operation, whereas local
transactions do not need to take a ticket as in the original ticket method.

5 Concluding Remarks 

In the last few years the subject of federated transactions has been largely
disregarded by the research community, despite the fact that the results from
the late eighties and early nineties do not provide practically viable
solutions. A major reason for this situation probably is that many application
classes are perfectly satisfied with a loose coupling of databases, no or very
little care about mutual consistency, and possibly application-level solutions
to avoid special types of severe inconsistencies. The expected proliferation of
advanced applications like virtual-enterprise workflows, electronic-commerce
agents and brokers, etc. should rekindle the community’s interest in federations
that span widely distributed, highly heterogeneous component systems while also
requiring a highly dependable IT infrastructure and thus highly consistent data.
The problems of federated information systems are inherently hard, but we should
not give up too early and rather pursue long-term efforts towards well-founded
but also practically viable solutions. This paper should be understood as a
first step along these lines. In particular, we have aimed to incorporate
important aspects of commercial database systems into a systematic and
rigorously founded approach to federated transaction management.

Proofs of Theorems

Proof of Theorem 1 

a) With the (SI-V) property as our premise and our overall assumption that
each write in a transaction must be preceded by a read of the same object,
the SI version order $\ll_s$ is the only version order that can render the MVSG
of a schedule acyclic. The theorem then follows immediately from the standard
MVSR theorem (see Subsection 2.1), because SI-MVSG contains the same edges
as the ordinary MVSG.

b) “if”: 

Assume that the schedule is not SI. Then there must be at least two concurrent
transactions $t_i$ and $t_j$ that write the same object x. Because of the
read-before-write restriction, both transactions read x before writing it, say
$t_i$ reads $x_k$ and $t_j$ reads $x_l$. By the definition of the version
function, $t_k$ must have been committed when $t_i$ started, so
$x_k \ll_s x_i$ by the definition of the SI version order $\ll_s$. The same
holds for $t_l$ and $t_j$, so we obtain $x_l \ll_s x_j$. Additionally, $t_k$
must commit before $t_j$, because $t_i$ reads from $t_k$ (so $t_k$ was committed
when $t_i$ started) and $t_j$ runs in parallel with $t_i$, so it commits after
the start of $t_i$. This yields $x_k \ll_s x_j$ and, analogously,
$x_l \ll_s x_i$. The SI-MVSG now contains the edges

- $t_i \rightarrow t_j$ labeled with “x”, because $r_i(x_k)$, $w_j(x_j)$ and $x_k \ll_s x_j$, and

- $t_j \rightarrow t_i$ labeled with “x”, because $r_j(x_l)$, $w_i(x_i)$ and $x_l \ll_s x_i$.

But this is an x-cycle and therefore a contradiction to the precondition.

“only if”: 

Assume there is an x-cycle
$t_{i_1} \rightarrow t_{i_2} \rightarrow \ldots \rightarrow t_{i_n} = t_{i_1}$
in SI-MVSG of the schedule s, $s \in SI$. If there is more than one, select one
with minimal length. Without loss of generality, assume that the transactions in
the cycle are renumbered such that
$t_1 \rightarrow t_2 \rightarrow \ldots \rightarrow t_n = t_1$.

We show first that there must be one edge in this cycle that was added due
to the (RW)-rule in the definition of SI-MVSG. Both the (WR)-rule and the
(WW)-rule add edges from a transaction that writes an older version of x to a
transaction that writes a younger version of x. Because the SI version order
$\ll_s$ is defined by the commit order of the transactions, one of those edges
from $t_i$ to $t_j$ means that $t_i$ commits before $t_j$, formally
$C_i < C_j$. But we have a cycle. If all edges were of these two types, this
would mean that $C_1 < C_1$, which is a contradiction, so there must be at least
one edge $t_i \rightarrow t_{i+1}$ where $C_i > C_{i+1}$; we choose the first
such edge.

This edge was added because we have $r_i(x_l)$ and $w_{i+1}(x_{i+1})$ in the
schedule, and $C_l < C_{i+1}$. We also have $C_l < B_i$ and there is no other
commit of a transaction in between them that writes x, because $t_i$ reads x
from $t_l$. Putting this together, we obtain $C_l < B_i < C_{i+1} < C_i$. We
have thus shown that $t_i$ and $t_{i+1}$ run concurrently. However, this does
not mean that the schedule is not in SI, because $t_i$ does not necessarily
write x. The read-before-write rule means that $t_{i+1}$ must read a version
$x_p$ of x before writing it. To show a contradiction, we now discuss the
different possibilities for the transaction $t_p$ from which transaction
$t_{i+1}$ can read.
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(i) If fi+i reads from a transaction that committed before ti, we get Cp <C 
Bi^i <C Cl- But this means that ti and U+i are concurrent and write the 
same object x, so the schedule s cannot be in SI, which is a contradiction. 

(ii) If $t_{i+1}$ reads x from $t_l$, we have $l = p$ and
$C_l \ll B_{i+1} \ll C_{i+1}$. This means that there is an x-edge in the
SI-MVSG from $t_l$ to $t_{i+1}$ because of the (WR)-rule. But then $t_l$
cannot be part of the cycle, or we could replace the edges
$t_l \to t_i \to t_{i+1}$ by the edge $t_l \to t_{i+1}$ and would get a
shorter cycle, contrary to the minimality of the selected cycle.

So we know that $t_i$ has an incoming x-edge, and that this edge does not
come from $t_l$, which is $t_i$'s only incoming x-edge of (WR) type ($t_i$
reads x only from $t_l$). Therefore the incoming x-edge must be an edge of
type (RW) or (WW). But the definition of the SI-MVSG says that $t_i$ must
write x in order to have such an incoming edge, so we have shown that the
concurrent transactions $t_i$ and $t_{i+1}$ both write x, which is a
contradiction.

(iii) If $t_{i+1}$ reads x from a transaction $t_q$ that committed after
$t_i$ started, we have the ordering
$C_l \ll B_i \ll C_q \ll B_{i+1} \ll C_{i+1} \ll C_i$. If $t_l$ was not part
of the cycle, we could show as before that $t_i$ must write x, so that s
would not be in SI; hence $t_l$ is on the cycle.

If there is no other transaction in between $t_l$ and $t_q$ that writes x,
the graph contains the edges $t_l \to t_q$ ($t_q$ reads x from $t_l$) and
$t_q \to t_{i+1}$ ($t_{i+1}$ reads x from $t_q$). If we replace the edges
$t_l \to t_i \to t_{i+1}$ by the edges $t_l \to t_q \to t_{i+1}$, we get a
cycle with the same length. Additionally, we can replace the edge from a
larger to a smaller version ($t_i \to t_{i+1}$) by edges that respect the
version order ($C_l \ll C_q \ll C_{i+1}$). As shown before there must be
another version-order-reversing edge in the cycle, so we can restart the
proof at the beginning. Since the cycle has finite length, we can do so only
a finite number of times until one of the other cases applies.

If there is a sequence of transactions $t_{r_1}, \dots, t_{r_k}$ between
$t_l$ and $t_q$ that write x, $t_{r_1}$ must read x from $t_l$, $t_{r_2}$
must read x from $t_{r_1}$, and so on, until $t_q$ must read x from
$t_{r_k}$. Therefore there are edges $t_l \to t_{r_1}$ and
$t_{r_1} \to t_{i+1}$ (because $r_{r_1}(x_l)$, $w_{i+1}(x_{i+1})$ and
$x_l \ll x_{i+1}$). Now we can replace $t_l \to t_i \to t_{i+1}$ by
$t_l \to t_{r_1} \to t_{i+1}$, which are edges that respect the version
order, and the same argument as before applies. □



Proof of Theorem 2 

We show that the global schedule is SI by showing that both SI properties (SI-V) 
and (SI-W) hold for the global schedule. 



(SI-V): Assume that there is a transaction $t_i$ in the global schedule that
reads a version of object x in database $DB_l$ that another transaction $t_j$
wrote, but that $t_j$ is not the “right” transaction in the sense of (SI-V).
Then $t_j$ is either uncommitted, committed after the begin of $t_i$, or
another transaction $t_k$ wrote x and committed between the commit of $t_j$
and the begin of $t_i$. But the begin and commit operations in all databases
are synchronized, so if any of these three cases holds globally, it also
holds in $DB_l$. Therefore the local subschedule in $DB_l$ does not satisfy
(SI-V), contrary to the assumption that all local schedules are SI.



(SI-W): Assume that there are two transactions $t_i$ and $t_j$ in the global
schedule that execute concurrently and that write an object x in database
$DB_l$. Because the begin and commit operations in all databases are
synchronized, the subtransactions $t_i^{(l)}$ and $t_j^{(l)}$ are executing
concurrently, too. But then the local schedule is not SI, which is a
contradiction. □
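The (SI-W) argument can be illustrated by a minimal Python sketch. It assumes,
purely for illustration, that every transaction is described by a begin time, a
commit time and a write set; the dictionary keys and the function names
`concurrent` and `si_w_violations` are hypothetical and do not come from the
paper. Because begin and commit operations are synchronized, the same times
describe the global schedule and every local schedule.

```python
from itertools import combinations


def concurrent(a, b):
    """Two transactions run concurrently if each begins before the other commits."""
    return a["begin"] < b["commit"] and b["begin"] < a["commit"]


def si_w_violations(transactions):
    """Return all pairs of concurrent transactions whose write sets overlap."""
    return [
        (a["name"], b["name"])
        for a, b in combinations(transactions, 2)
        if concurrent(a, b) and a["writes"] & b["writes"]
    ]


# Hypothetical schedule with synchronized begin and commit times.
schedule = [
    {"name": "t1", "begin": 0, "commit": 5, "writes": {"x"}},
    {"name": "t2", "begin": 3, "commit": 8, "writes": {"x"}},
    {"name": "t3", "begin": 9, "commit": 12, "writes": {"x"}},
]
print(si_w_violations(schedule))
```

Here only t1 and t2 overlap in time and write the same object, so a pair
reported for the global schedule is also a violation in the local schedule of
the database that holds x.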



Proof of Theorem 3 

“There are no ...” $\Rightarrow$ “(SI-V) holds”: Assume (SI-V) does not hold.
Then there is a transaction $t_i$ that reads a version of an object x that is
either “too old” or “too new”, with x being part of database $DB_l$. All reads
and writes of x are therefore part of subtransactions in this database.

If the version is too old, $t_i$ does not read x from the last transaction
$t_j$ that wrote x and committed before $t_i$ started. So there must be
another transaction $t_k$ that committed in between: $C_j \ll C_k \ll B_i$,
and both $t_j$ and $t_k$ write x. The subtransaction of transaction $t_i$ in
$DB_l$ cannot begin before the begin of $t_i$ itself, so
$B_i \ll B_i^{(l)}$. With the AGP assumption, together this yields
$C_j^{(l)} \ll C_k^{(l)} \ll B_i^{(l)}$. This means that $t_i^{(l)}$ does not
read x from the last transaction that committed before $t_i^{(l)}$ started in
the local database system, but this is a contradiction to $DB_l$ guaranteeing
SI for all local subtransactions.

If the version is too new, $t_i$ reads x from a transaction $t_j$ that was not
yet committed when $t_i$ globally started. Because the local schedulers
guarantee SI, $t_j^{(l)}$ must have been committed when $t_i^{(l)}$ executed
its first operation, so $C_j^{(l)} \ll B_i^{(l)}$, and therefore
$t_j^{(l)} \ll t_i^{(l)}$. On the other hand, there must be at least one
database $DB_k$ where $t_i^{(k)}$ started before $t_j^{(k)}$ committed,
because globally $t_i$ started before $t_j$ committed, so
$B_i^{(k)} \ll C_j^{(k)}$. Together with $B_j^{(k)} \ll C_j^{(k)}$ and
$C_j \ll C_i$ this yields $t_i^{(k)} \parallel t_j^{(k)}$, which is a
contradiction.

“(SI-V) holds” $\Rightarrow$ “There are no ...”: Assume that there are
databases $DB_l$ where $t_i^{(l)} \ll t_j^{(l)}$ and $t_j^{(l)}$ reads x from
$t_i^{(l)}$, and $DB_k$ where $t_i^{(k)} \parallel t_j^{(k)}$. Because $t_j$
reads from $t_i$ in $DB_l$, we have
$B_i^{(l)} \ll C_i^{(l)} \ll B_j^{(l)}$. In database $DB_k$, $t_i^{(k)}$ and
$t_j^{(k)}$ run concurrently, so that either
$B_i^{(k)} \ll B_j^{(k)} \ll C_i^{(k)}$ or
$B_j^{(k)} \ll B_i^{(k)} \ll C_j^{(k)}$.

Whatever case applies, we see that $t_j^{(k)}$ begins before $t_i^{(k)}$
commits, so that the global transactions $t_i$ and $t_j$ run concurrently.
This means that $t_j$ reads x from a globally concurrent transaction, which is
a contradiction to the version function satisfying (SI-V). □

Proof of Theorem 4

By Theorem 3, we already know that s satisfies (SI-V). To show that s also 
satisfies (SI-W), we use the characterization by the SI-MVSG that we introduced 
in Section 2.2. The SI-MVSG of the global schedule is the union of the SI- 
MVSGs of the local subschedules. The SI property of the global schedule then 
follows immediately from the SI property of the local subschedules: If all local 
subschedules are SI, there is no x so that there is an x-edge in one of the local 
SI-MVSGs. But since an object exists in exactly one database, there is no x-edge 
in the global SI-MVSG either. So by Theorem 1 the global schedule is SI. □



Abstract. Transaction management comprises different aspects such as
concurrency control, recovery control, and replication control. Usually,
only one or at most two of these aspects are considered in theories of
transaction management; the other aspects are ignored. In this paper,
we propose a model of executions that captures all three aspects
of transaction management. Based on this execution model, we present
a definition of serializability. Then, we show how the requirement of
serializability can be decomposed into requirements that can be attributed
to concurrency control, to replication control, and to recovery control,
respectively. Altogether, we obtain a unified theory of transaction
management, where we focus on concurrency control and replication control
in this paper.



1 Introduction 

Basically, a concurrent execution of some set of programs is called serializable 
if each of the programs appears as if it has been executed atomically without 
interference of the other programs. Therefore, the outcome of a serializable exe- 
cution is the outcome of a sequential execution of the programs one after the 
other in some order. In this context, the programs are often called transactions. 
So, serializability provides an abstraction of atomic execution of transactions on 
top of a concurrent non-atomic execution of transactions in reality. 

Similarly, a consistency model for the access to replicated data in a distributed 
shared memory system guarantees that a concurrent execution of a program 
appears as if there has been exactly one copy of each object though there have 
been many copies in reality. So, a consistency model provides an abstraction of 
a conventional memory on top of a distributed shared memory with replication. 

For both abstractions, serializability and consistency, there exist many
theories. However, theories of serializability mostly presuppose an underlying
conventional memory, and theories of distributed shared memory systems mostly
ignore transactional aspects. Though there are some approaches that deal with
serializability in the context of data replication, these approaches
presuppose a particular memory model. For example, Bernstein et al. [2]
assume that there is a fixed set of copies for each object and that copies are
updated at write-time. This excludes dynamic creation of new copies,
invalidation of copies, and creation of new copies on request (i.e. at
write-time or at read-time).

In this paper, we propose a model for concurrent executions¹ of transactions
which allows serializability to be defined independently of an underlying
memory or consistency model. This definition does not even distinguish between
replicated data and non-replicated data. There will be only one definition of
serializability which applies to conventional memory as well as to
multi-version serializability and serializability in replicated databases. In
a second step, we will show that serializability can be split into different
requirements: one concerning correct scheduling (synchronization) of access
operations, one concerning consistency of accessed copies, and another
concerning recovery control. We call the requirement concerning
synchronization of access events the concurrency control requirement and the
requirement concerning consistency of accessed copies the replication control
requirement. The splitting into a concurrency control part and a replication
control part helps to implement a scheduler responsible for the concurrency
control part and a memory manager responsible for the replication control part
independently of each other. Therefore, this splitting can be seen as an
interface for correct interaction between memory managers and schedulers [6].
Moreover, the splitting provides a clearer understanding of the two
abstractions atomicity and consistency mentioned above. In a sense, our model
complements the approach of Schek et al. [11] which provides a unified model
for specifying correctness for concurrency control and recovery control. Our
model additionally includes replication control and does not require
compensatable write events.

The model of concurrent executions proposed in this paper basically consists 
of a partial order (a causality) of read and write events to some objects — as usual 
in serializability theory (e.g. [2,5]). In addition, we represent the propagation of 
values between read and write operations explicitly in each execution by a so- 
called data causality [4] . This is necessary because the values returned by a read 
event can no longer be deduced from the order of the access events when no 
specific memory model is presumed. Basically, our execution model differs from 
the classical one by its explicit representation of the reads-from relation. Still, 
there are some differences between data causality and the classical reads-from 
relation which will be discussed in Sect. 2.4. 

The goal of this paper is to stimulate a discussion on the execution model
and the corresponding definition of serializability. Therefore, we concentrate
on the basic idea and on a careful motivation of the model rather than on its
precise technical definition. The precise definitions and formal proofs of the
results mentioned in this paper can be found in [8,6,7].

¹ In the context of concurrency control, executions are often called histories
or schedules. We will use the name execution throughout this paper.

2 The execution model

In this section, we introduce and motivate our execution model which will be 
the formal basis for defining serializability in Sect. 3. 

2.1 Events and causality 

Basically, an execution consists of a set of events and a partial order on these 
events. The partial order indicates causal dependencies between the events and 
is therefore called causality. In our context, there are read and write events to 
some fixed set of objects and there are commit events. 

Figure 1 shows an example of a partial order of events, which consists of
three transactions: two committed transactions and one uncommitted
transaction. Graphically, each event is represented by a square. An
inscription represents the type of event: R[x] stands for a read event on
object x, W[x] stands for a write event on object x, and C stands for a
commit event.

Fig. 1. A partial order of events

2.2 Transaction causality 

In Fig. 1, there are three transactions which are indicated by dashed boxes.
These dashed boxes, however, are not part of the formal execution model. For
defining serializability, the execution model must provide some explicit
information on the involved transactions and on which event belongs to which
transaction.

In our execution model, we use a distinguished causality, the so-called
transaction causality, for distinguishing different transactions and assigning
events to these transactions. Figure 2 shows the transaction causality
corresponding to the dashed boxes in Fig. 1. In order to distinguish
transaction causality from other causalities, we graphically represent it by
arrows with a white arrow head. Note that transaction causality only
represents causal dependencies within a transaction (which are represented in
the code of the transaction itself). There are no transaction causalities
between different transactions; but there may be other causal dependencies
(for example, imposed by a transaction manager which schedules the different
events of the transactions). In turn, we adopt the view that two events belong
to the same transaction if and only if they are somehow connected by
transaction causality.

Fig. 2. Transaction causality
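The view that two events belong to the same transaction exactly when they are
connected by transaction causality can be sketched in Python as a
connected-components computation. The event identifiers, the function name
`group_into_transactions`, and the representation of transaction causality as
a list of pairs are assumptions made for this illustration only.

```python
def group_into_transactions(events, transaction_causality):
    """Group events into transactions.

    Two events end up in the same group if and only if they are connected
    by transaction causality, ignoring the direction of the arrows.
    """
    parent = {e: e for e in events}

    def find(e):
        # Follow parent links to the representative, halving the path on the way.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in transaction_causality:
        parent[find(a)] = find(b)

    groups = {}
    for e in events:
        groups.setdefault(find(e), []).append(e)
    return sorted(groups.values())


# Hypothetical execution with six events and three transactions.
events = ["e1", "e2", "e3", "e4", "e5", "e6"]
causality = [("e1", "e2"), ("e3", "e4"), ("e4", "e5")]
print(group_into_transactions(events, causality))
```

An event without any transaction causality, such as e6 here, forms a
transaction of its own.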

In the rest of this paper, we will also assume that each transaction is se- 
quential and has at most one commit event. This restriction, however, is not 
necessary but slightly simplifies the presentation of our ideas. 



2.3 Data causality 

As stated before, we do not presume a particular memory model on which the
transactions are executed. Therefore, we cannot deduce the values returned by
a read event from the order of the executed events. In the presence of
different copies for the same object, it might happen that a read event
returns an older value than written by an intermediate write event. For
example, the read event might access an old copy. Actually, this happens in
multi-version databases without doing any harm.

Therefore, we represent the propagation of values between read and write
events (on the same objects) explicitly in our execution model by data
causality. Figure 3 shows an example of data causality for the events of our
previous example. Data causality is represented by bold-faced arrows. For
example, we can deduce from this data causality that the read event on object
y of transaction B reads the value written by transaction A. Indeed, this
should not happen in a serializable execution because a committed transaction
should not read a value written by an uncommitted transaction. This will be
excluded in the definition of serializability but not in the execution model
itself (see Sect. 3).

Fig. 3. Data causality

The read event on object x of transaction B reads the value which results
from successively² performing the write event of transaction C on x and the
write event of transaction B on x. Note that the second read event of
transaction C does not read the value of the write event of transaction B,
though that write event happens immediately before it according to the
causality in Fig. 1. Thus, the second read event of transaction C reads from
an ‘older copy’ of x which is implicitly encoded in the data causality (by a
split of data causality at the write event of transaction C).

² We do not claim that the read event returns a value written by a single
write event. The reason is that we do not require that write events are
total. Therefore, the value returned by a read event may be the outcome of a
sequence of write events. See Sect. 2.4 for more details.

Data causality must satisfy two requirements in order to have an intuitive
interpretation. First, two events which access different objects should never
be related by data causality because data causality is supposed to represent
the propagation of data between read and write events on the same object.
Second, data causality may branch forward (which corresponds to the creation
of copies) but there should be no backward branching (merge of data
causality). In the latter case, we would have difficulties in interpreting the
returned value of a read event as the outcome of a sequence of write events.
By excluding backward branching of data causality, we need not deal with this
problem. We will, however, weaken this restriction in Sect. 5.
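The two requirements on data causality can be checked mechanically. The
following Python sketch assumes that data causality is given by its direct
arrows (as drawn in the figures, without the pairs implied by transitivity)
and that a dictionary maps every event to the object it accesses; the function
name `check_data_causality` and this representation are hypothetical.

```python
def check_data_causality(object_of, direct_arrows):
    """Report violations of the two requirements on data causality.

    object_of:      maps an event to the object it reads or writes
    direct_arrows:  iterable of (source, target) pairs, direct arrows only
    """
    problems = []
    predecessors = {}
    for src, dst in direct_arrows:
        # Requirement 1: only events on the same object may be related.
        if object_of[src] != object_of[dst]:
            problems.append(f"{src}->{dst} relates different objects")
        predecessors.setdefault(dst, set()).add(src)
    # Requirement 2: no backward branching, i.e. no event with two
    # distinct direct predecessors.
    for event, preds in sorted(predecessors.items()):
        if len(preds) > 1:
            problems.append(f"backward branching at {event}")
    return problems


object_of = {"w1": "x", "r1": "x", "r2": "x", "w2": "y"}
print(check_data_causality(object_of, [("w1", "r1"), ("w1", "r2")]))
print(check_data_causality(object_of, [("w1", "r1"), ("r2", "r1")]))
```

The first call describes a forward branch (two copies of the value written by
w1) and reports no problem; the second call describes a merge at r1 and is
rejected.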

Altogether, an execution is a partial order of read, write, and commit events 
where two causalities are distinguished from other causalities: transaction cau- 
sality and data causality. Figure 4 shows the integrated view of the execution 
which has been introduced step by step previously. 




2.4 Discussion 

Representing an execution as a partial order of write, read, and commit
events is a classical approach [2,5]. Usually, the correspondence between an
event and the transaction it belongs to is represented by some index. We
represent it implicitly by transaction causality. This, however, is not
fundamental and does not make a big difference. We have chosen transaction
causality because it fits nicely into the causality-based setting.

The really new part in our execution model is data causality which explicitly
represents the propagation of values between read and write events. This way,
we are able to formalize serializability without presupposing a particular
memory model; indeed, the memory model is in the execution itself. In
particular, a single concept of serializability covers all classical
definitions of serializability for conventional memory, multi-version
databases, and databases with replication. This definition of serializability
formally captures all aspects of concurrency control, replication control, and
recovery control.

Of course, there is a relationship between data causality and the reads-from 
relation: Both concepts represent how values are propagated between write and 
read events. However, there are some significant differences: 

1. Data causality is an integral part of an execution whereas the reads-from 
relation is a derived concept: Given an execution and some underlying me- 
mory model, the reads-from relation can be derived. Since we are interested 
in a definition of serializability independent from a particular memory mo- 
del, the reads-from relation can no longer be derived. Therefore, either the 
reads-from relation or data causality must be an integral part of the execu- 
tion model. 

2. Data causality is a transitive relation whereas the reads-from relation is
not transitive. The transitivity of data causality has two benefits:

First, there exists a range of techniques for reasoning with and about
causalities which heavily exploit transitivity. For example, these techniques
can be employed for verifying correctness of transaction protocols and data
consistency protocols (modelled as a special kind of Petri nets [4,9]). These
techniques, however, are beyond the scope of this paper.

Second, transitivity of data causality allows us to deal not only with the
classical read-write-model (RW-model) for databases, where each write
operation completely overwrites previously written values, but also with the
action-model (A-model), which additionally allows partial write operations and
atomic read-and-modify operations (e.g. increment or decrement), where a part
of a previously written value is retained after the modification operation.
For an example of a partial write operation, consider an object which consists
of two components (e.g. a record or a tuple) and two write events such that
one write event changes one component and the other write event changes the
other component. Then, a read event does not return a value written by any
single one of these write events. Rather, the read event returns a mixed
value. This situation can occur in SQL transactions which update different
attributes of the same tuple independently of each other and later on select
this tuple. This situation is not formally captured by the reads-from relation
but it is captured by data causality (cf. the situation in Fig. 4 discussed
before). One might argue that we should choose the components as individual
objects and deal with their consistency separately in the above example. Then,
the above problem would not occur. The granularity of replication, however, is
often fixed and cannot be freely chosen. Therefore, a model of replication
must properly deal with write operations which only partially change an object
(called partial write operations for short).

Data causality allows us to formalize serializability not only for the
RW-model but also for the A-model and, in particular, for the partial write
and update operations provided by SQL.

3. The definition of the reads-from relation for an execution makes
assumptions on the underlying recovery manager. It assumes that values written
by an uncommitted transaction are ignored somehow. Data causality allows us to
be more explicit on this point. For example, the write event of transaction A
in Fig. 4 is not ignored by transaction B: there is a data causality from the
write event of transaction A to the read event on object y of transaction B.
Having this clearly illegal behaviour in the execution model makes it possible
to explicitly exclude it in the specification of serializability. This way,
correct recovery is also formally captured by the definition of
serializability. The relation to the approach of Schek et al. [11] will be
discussed in Sect. 3.3.
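The partial write situation described in the second item of the list above can
be made concrete with a small Python sketch. It assumes, for illustration
only, that an object is a record represented as a dictionary and that a
partial write is a dictionary naming just the components it changes; the
names `read_after`, `record`, `w1` and `w2` are hypothetical.

```python
def read_after(initial, partial_writes):
    """Value returned by a read after the given partial writes, applied in order."""
    value = dict(initial)
    for partial_write in partial_writes:
        # A partial write changes only the components it names and
        # retains all other components of the previous value.
        value.update(partial_write)
    return value


record = {"name": "old name", "city": "old city"}
w1 = {"name": "Ada"}    # first write event: changes one component
w2 = {"city": "Paris"}  # second write event: changes the other component

result = read_after(record, [w1, w2])
print(result)
```

The value read is neither the value produced by w1 alone nor the value
produced by w2 alone; it is the outcome of the sequence of both write events,
which is why a single reads-from arrow cannot describe it.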

Altogether, the execution model with data causality allows a definition of se- 
rializability which covers all aspects of concurrency control, replication control, 
and recovery. Actually, the execution model can be equipped with some more 
features, which will be discussed in Sect. 5. These features, however, are not 
relevant for understanding the basic idea of our approach. 

Now, where do the executions come from? There are two answers to this
question: First, an execution is an abstract representation of what is really
going on when some transactions are executed by a database management system.
We do not bother where the executions come from. We only define whether
the execution is considered to be correct (i.e. to be serializable) or whether
it is considered to be incorrect. Second, we can model a database management
system with the help of a special kind of Petri nets. Each such model has a
precisely defined set of executions. Then, we can use the techniques proposed
in [9] for verifying that all executions of the modelled database management
system are serializable. In this paper, however, we are mainly interested in
the specification of serializability. We do not model and verify database
management systems here (see [8] for more details on modelling and verifying
database management systems).

3 Serializability 

In the previous sections, we have introduced and motivated our execution model. 
Now, we will give a definition of serializability for this execution model. Again, 
we refer to [8,6] for precise definitions and rigorous proofs of the results. Here, 
we focus on the basic idea. 

3.1 Definition 

As already mentioned in the introduction, the basic idea of serializability is 
the following: An execution is serializable if all committed transactions can be 
arranged in some sequence in which they could have been executed one after 
the other on a conventional memory with the same outcome — i.e. in which each 
event reads respectively writes the same value as the corresponding event in the 
original execution. In particular, the values written by uncommitted transactions 
do not affect the committed transactions. 

Now, we formulate this idea in terms of our execution model without assuming a
particular underlying memory. Let us start with the requirement that write
events of uncommitted transactions do not affect events of committed
transactions (recovery): We say that data causality respects commit events if
there is no data causality from a write event of an uncommitted transaction to
an event of a committed transaction. In the execution from Fig. 4, data
causality does not respect commit events due to the data causality from the
write event of transaction A to the read event on object y of transaction B.
If we add a commit event to transaction A as shown in Fig. 5, data causality
respects commit events. Note that we do not fix a particular scheduling
strategy in order to guarantee that data causality respects commit events.
Indeed, this requirement could be implemented by quite different protocols.
For example, it could be implemented by pessimistic recovery protocols
(allowing strict schedules only) as well as by optimistic recovery protocols
(allowing non-strict schedules, but requiring cascading aborts) [2].
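The requirement that data causality respects commit events can be sketched in
Python as follows. The event names, the dictionary `owner` (event to
transaction), the set `write_events` and the function name
`respects_commit_events` are assumptions introduced for this illustration; the
two calls mirror the situation described for Fig. 4 and Fig. 5.

```python
def respects_commit_events(owner, committed, write_events, data_causality):
    """True if no write event of an uncommitted transaction has a data
    causality to an event of a committed transaction."""
    for src, dst in data_causality:
        if (
            src in write_events
            and owner[src] not in committed
            and owner[dst] in committed
        ):
            return False
    return True


# Transaction A writes y, transaction B reads that value.
owner = {"W_A[y]": "A", "R_B[y]": "B"}
write_events = {"W_A[y]"}
data_causality = {("W_A[y]", "R_B[y]")}

# Only B has committed: the requirement is violated.
print(respects_commit_events(owner, {"B"}, write_events, data_causality))
# After adding a commit event to A: the requirement holds.
print(respects_commit_events(owner, {"A", "B"}, write_events, data_causality))
```

The check looks only at the execution itself and is therefore independent of
whether a pessimistic or an optimistic recovery protocol produced it.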



Fig. 5. A serializable execution



The execution from Fig. 5 does not only respect commit events, it is even
serializable. This can be checked by arranging the committed transactions (all
transactions in this example) in a linear order as shown in Fig. 6. In this
linear arrangement, transaction causality and data causality are not changed;
other causalities, however, have been omitted because they are irrelevant for
serializability. Since we did not change transaction causality and data
causality, the values written and read by the events are still the same as in
the original execution. In order to check serializability, we need to show
that the values returned by read events in this execution (according to data
causality) are the same as in a sequential execution of the events in the
order of the linear arrangement on a conventional memory: We say that data
causality is compatible with the linear arrangement if for each pair of a
write event e on some object x and a read or write event e' on the same object
x there is a data causality from e to e' if and only if e occurs before e' in
the linear arrangement. This requirement is graphically represented in Fig. 7
where a write or read event on object x is represented by X[x] and the
‘if-and-only-if’ requirement is split into two implications. If data causality
is compatible with a linear arrangement, the events write and read the same
values as on a conventional memory. This can be shown by a simple induction
argument on the linear arrangement which exploits the interpretation of data
causality given before. A formalization of this interpretation can be found
in [7].

Fig. 6. A linear arrangement

Note that compatibility does not require that data causality goes only from
top to bottom (with respect to the linear arrangement): For two read events on
the same object, data causality may run from bottom to top. But, data
causality may not run from bottom to top between a write and a read event; it
must run from top to bottom between a read and a write event.

Fig. 7. Graphical representation of compatibility

Altogether, an execution is serializable if 

1. there exists a linear arrangement of all committed transactions such that 
data causality is compatible with this linear arrangement and if 

2. data causality respects commit events. 
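The definition can be turned into a brute-force check, sketched below in
Python. The sketch assumes that each transaction is a list of events
`(id, kind, object)` in program order with kind "R", "W" or "C", and that data
causality is given as a set of pairs that is already transitively closed; the
function names and this encoding are hypothetical, and trying all permutations
is only feasible for very small examples.

```python
from itertools import permutations


def compatible(line, dc):
    """Check compatibility of data causality dc with a linear arrangement."""
    for i, (e, kind_e, obj_e) in enumerate(line):
        if kind_e != "W":
            continue
        for j, (f, _, obj_f) in enumerate(line):
            # For a write e and another event f on the same object:
            # data causality from e to f  <=>  e occurs before f.
            if i != j and obj_e == obj_f and ((e, f) in dc) != (i < j):
                return False
    return True


def respects_commits(txns, committed, dc):
    owner = {e: t for t, events in txns.items() for e, _, _ in events}
    kind = {e: k for events in txns.values() for e, k, _ in events}
    return not any(
        kind[e] == "W" and owner[e] not in committed and owner[f] in committed
        for e, f in dc
    )


def is_serializable(txns, committed, dc):
    if not respects_commits(txns, committed, dc):
        return False
    return any(
        compatible([ev for t in order for ev in txns[t]], dc)
        for order in permutations(sorted(committed))
    )


# B reads the value written by A: serializable in the order A, B.
reads_from = {
    "A": [("wA", "W", "x"), ("cA", "C", None)],
    "B": [("rB", "R", "x"), ("cB", "C", None)],
}
print(is_serializable(reads_from, {"A", "B"}, {("wA", "rB")}))

# Lost update: both transactions read the initial value and write x.
lost_update = {
    "A": [("rA", "R", "x"), ("wA", "W", "x"), ("cA", "C", None)],
    "B": [("rB", "R", "x"), ("wB", "W", "x"), ("cB", "C", None)],
}
print(is_serializable(lost_update, {"A", "B"}, set()))
```

In the lost-update example neither write has a data causality to the other
transaction's read, so no linear arrangement is compatible.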

3.2 Serializability theorem 

Similar to classical theory, we have defined serializability with the help of
a linear arrangement of transactions. This kind of definition is appropriate
for understanding its purpose and its intention. But, this kind of definition
is not appropriate for checking whether an execution is serializable because
all possible linear arrangements must be checked for ‘compatibility’. In this
section, we give an equivalent definition of serializability in terms of a
so-called precedence relation which is similar to the classical serialization
graph [2].

The precedence relation is defined on all committed transactions of an execu- 
tion and says which transaction must be arranged before another transaction in 
a compatible linear arrangement. Since we assume that each committed transac- 
tion has exactly one commit event, we define the precedence relation on commit 
events. A transaction A must precede another transaction B in the following two 
cases: 

1. There is a write event in transaction A which has a data causality to some 
event in transaction B. In that case, transaction B is affected by transaction 
A. Therefore, transaction B must be arranged after transaction A. 

2. There is a read or write event e on some object x in transaction A and a 
write event e' on the same object x in transaction B such that there is no 
data causality from e' to e. In that case, transaction A is not affected by 
this particular write event of transaction B. Therefore, transaction B must 
be arranged after transaction A.

The two situations for the definition of a precedence relation from A to B are
graphically represented in Fig. 8, where the required precedence is shown by
dashed arrows between the corresponding commit events.





Fig. 8. The precedence relation 



Now, it can be shown that an execution is serializable if and only if

1. the precedence relation has no (non-trivial) cycles,

2. data causality respects commit events, and

3. data are correctly propagated within a single transaction; i.e. for each
write event e on some object with a transaction causality to some read or
write event e' on the same object there must also be a data causality from
event e to event e' (cf. second read event on object x of transaction C in our
example).
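The first condition of this characterization can be sketched in Python as
follows. The sketch assumes the same hypothetical encoding as before: each
transaction is a list of events `(id, kind, object)` with kind "R", "W" or
"C", and data causality is a transitively closed set of pairs. The function
names are not from the paper, and conditions 2 and 3 are not checked here.

```python
def precedence_relation(txns, committed, dc):
    """Pairs (A, B) of committed transactions such that A must precede B."""
    edges = set()
    for a in committed:
        for b in committed:
            if a == b:
                continue
            for e, kind_e, obj_e in txns[a]:
                for f, kind_f, obj_f in txns[b]:
                    # Case 1: a write event of A has a data causality
                    # to some event of B.
                    case1 = kind_e == "W" and (e, f) in dc
                    # Case 2: B writes an object that A reads or writes,
                    # and there is no data causality from B's write to
                    # A's event.
                    case2 = (
                        kind_e in ("R", "W")
                        and kind_f == "W"
                        and obj_e == obj_f
                        and (f, e) not in dc
                    )
                    if case1 or case2:
                        edges.add((a, b))
    return edges


def is_acyclic(nodes, edges):
    """Kahn's algorithm: repeatedly remove nodes without incoming edges."""
    indegree = {n: 0 for n in nodes}
    for _, b in edges:
        indegree[b] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for a, b in edges:
            if a == n:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == len(nodes)


# Lost update: both transactions read the initial value of x and write x.
lost_update = {
    "A": [("rA", "R", "x"), ("wA", "W", "x"), ("cA", "C", None)],
    "B": [("rB", "R", "x"), ("wB", "W", "x"), ("cB", "C", None)],
}
edges = precedence_relation(lost_update, {"A", "B"}, set())
print(sorted(edges), is_acyclic(["A", "B"], edges))
```

For the lost-update execution the second case applies in both directions, so
the precedence relation contains a cycle and the execution is rejected without
enumerating linear arrangements.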



3.3 Discussion 

Readers familiar with the classical serializability theorem might be puzzled
by the three items in the equivalent characterization of serializability
because there is only one (acyclicity of the serialization graph) in the
classical serializability theorem. The reason is the slightly more general
scope of our definition of serializability. The second item deals with the
correct recovery of aborted (not committed) transactions which is ignored in
classical serializability theory. The third item deals with correct
propagation of values between access operations within a single transaction;
this issue is also often ignored in classical approaches (or multiple access
to the same object is even forbidden).

In practice, however, multiple access to the same object does occur and cor- 
rect recovery is crucial for a correct operation of a database system. Of course, 
these issues are not ignored in implementations, but in theory. In implemen- 
tations, mistakes might occur in such subtle points. Therefore, an appropriate 
theory should cover these points. 

Usually, serializability theory starts with the definition of an equivalence
on executions. Then, an execution is defined to be serializable if there
exists an equivalent serial execution. In our definition, we do not explicitly
define an equivalence on executions. Rather, we arrange the transactions in
some linear order. Since we keep data causality and transaction causality, the
values written and read do not change. We rather check that the execution of
the events in the linear arrangement could have been on a conventional memory
(i.e. a sequentially consistent memory [10] which can be also defined for our
execution model [7]). This requirement is captured by the concept of
compatibility of data causality with the linear arrangement. This way, we need
not implicitly encode the concept of an underlying classical memory in some
notion of equivalence but we can explicitly represent it in the definition of
compatibility (resp. in the definition of sequential consistency).

One goal of our execution model was the definition of serializability which 
formally captures the aspect of recovery. This was also the goal of the work of 
Schek et al. [11]. The difference, however, is that executions need to be expanded 
in [11]. Basically, each abort event is replaced by a sequence of undo events (one 
for each write event of the corresponding transaction) and a commit event. This 
expansion corresponds to a particular recovery strategy. In particular, the undo 
operations are supposed to be executed atomically, i.e. without interference of 
events of other transactions. In our approach, we only need to require that 
there is no data causality from an uncommitted transaction to a committed 
transaction. This allows for different recovery strategies, for example a backup 
to the object's before-image. In contrast, Schek et al. [11] assume that each write 
operation can be compensated by an inverse write operation. 
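
The requirement that no data flows from an uncommitted transaction to a 
committed one can be checked mechanically. The following is a minimal sketch and 
not part of the formal model of [7]; the encoding of an execution as a dictionary 
and a set of pairs, and the names `events`, `data_causality`, and 
`respects_commits`, are assumptions made for illustration only. 

```python
def respects_commits(events, data_causality, committed):
    """Return True if no data causality leads from an uncommitted
    transaction to a committed one.

    events:         dict mapping an event name to its transaction name
    data_causality: set of (source_event, target_event) pairs
    committed:      set of names of committed transactions
    """
    for source, target in data_causality:
        if events[source] not in committed and events[target] in committed:
            return False
    return True


# Hypothetical execution: transaction A writes x and aborts,
# transaction B writes and reads x and commits.
events = {"wA": "A", "wB": "B", "rB": "B"}
committed = {"B"}

# B reads its own write: acceptable.
clean = respects_commits(events, {("wB", "rB")}, committed)
# B reads the value written by the aborted transaction A: rejected.
dirty = respects_commits(events, {("wA", "rB")}, committed)
```

The check says nothing about how the recovery manager achieves the property; a 
before-image backup and a compensation-based strategy would both pass it. 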

4 Separation of concerns 

In the previous section, we have introduced two equivalent definitions of seria- 
lizability. Usually, this overall requirement of serializability is guaranteed by a 
combination of different modules of a database management system: for exam- 
ple, a scheduler, a memory manager, and a recovery manager. The scheduler 
is responsible for a correct synchronization of read and write events of diffe- 
rent transactions (concurrency control). The scheduler, however, does not deal 
with the propagation of values between read and write events; this is the task 
of the memory manager (replication control) . The recovery manager guarantees 
correct backup for aborted or crashed transactions. Serializability defines the 
overall correctness for the different modules. 

In this section, we split serializability into different requirements concerning 
different modules. If each requirement is guaranteed by the corresponding 
component, serializability is guaranteed for the complete database management 
system. The requirement concerning the scheduler basically is conflict 
serializability [2] (in its classical definition)*, and the requirement concerning the 
memory manager is weak coherence [3,1]. Weak coherence is a much weaker consistency 
model than sequential consistency and, therefore, it can be implemented in a 
more efficient way. Most interestingly, we need not require sequential consistency 
for the memory manager though this might be expected from the definition of 
serializability. 

* Conflict serializability is defined in terms of the order of events and not in terms of 
propagated data. Therefore, conflict serializability deals with the correct 
synchronization of events only, which is the task of a scheduler. 

If the scheduler guarantees conflict serializability, the memory manager gu- 
arantees weak coherence on committed read and write events, and the recovery 
manager guarantees that data causality respects commits, serializability is gua- 
ranteed (see [8,6,7] for details). 

4.1 Weak coherence 

We start with a reformulation of weak coherence for our execution model (see 
[9,7] for details). An execution is weakly coherent if for each write event e on 
some object x and each read or write event e' on the same object x which 
happens causally after e there is also a data causality from e to e'. In the context 
of transactions and serializability, we only impose this requirement on events 
of committed transactions. Figure 9 shows a graphical representation of this 
requirement. 
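
The definition can be turned into a small executable check. The sketch below is 
an illustration under stated assumptions, not the formalization of [9,7]: events 
are encoded as (kind, object, transaction) triples, "happens causally after" is 
taken to be the transitive closure of the given causality edges, and a data 
causality "from e to e'" is read as a path of data causality edges. All function 
and parameter names are hypothetical. 

```python
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of a set of (a, b) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_weakly_coherent(events, causality, data_causality, committed):
    """events: dict name -> (kind, obj, txn) with kind in {"R", "W"}.
    causality / data_causality: sets of direct (e, e2) edges.
    committed: set of committed transaction names."""
    before = transitive_closure(causality)
    data = transitive_closure(data_causality)
    for e, e2 in before:
        kind, obj, txn = events[e]
        _, obj2, txn2 = events[e2]
        if kind != "W" or obj != obj2:
            continue  # only writes followed by accesses to the same object
        if txn not in committed or txn2 not in committed:
            continue  # requirement is imposed on committed transactions only
        if (e, e2) not in data:
            return False
    return True


# Hypothetical execution: A writes x, B reads x causally afterwards.
events = {"w": ("W", "x", "A"), "r": ("R", "x", "B")}
causality = {("w", "r")}
coherent = is_weakly_coherent(events, causality, {("w", "r")}, {"A", "B"})
stale = is_weakly_coherent(events, causality, set(), {"A", "B"})
```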




Fig. 9. Definition of weak coherence 



4.2 Conflict serializability 

Conflict serializability of an execution is defined with the help of the serialization 
graph on the committed transactions. Again, we represent the serialization graph 
as a relation on the commit events. Let us consider two access events e and e' on 
the same object, at least one of which is a write event. Then, there is an edge in the 
serialization graph from the transaction of e to the transaction of e' if e happens 
causally before e'. This requirement is graphically represented in Fig. 10, where 
an edge of the serialization graph is represented by a dotted arrow. An execution 
is conflict serializable if the serialization graph has no (non-trivial) cycles. 

Note that the execution from Fig. 5 is not conflict serializable due to the write 
event on object x of transaction B which happens causally between two access 
events of transaction C. But, it is serializable. This shows that the splitting into 
several parts is stronger than serializability. 
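
The construction of the serialization graph and the cycle test can be sketched as 
follows. This is an illustrative encoding only: events are (kind, object, 
transaction) triples, the relation `before` is assumed to be the already 
transitively closed "happens causally before" relation, and edges within one 
transaction are dropped because only non-trivial cycles matter. The example data 
mimic the pattern discussed for Fig. 5 (a write of B between two accesses of C) 
but are not taken from that figure. 

```python
def serialization_graph(events, before, committed):
    """Return the set of (txn, txn2) edges between committed transactions."""
    edges = set()
    for e, e2 in before:
        kind, obj, txn = events[e]
        kind2, obj2, txn2 = events[e2]
        conflicting = obj == obj2 and "W" in (kind, kind2)
        if conflicting and txn != txn2 and {txn, txn2} <= committed:
            edges.add((txn, txn2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:
                return True  # back edge closes a cycle
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# C reads x, then B writes x, then C reads x again.
events = {
    "c1": ("R", "x", "C"),
    "b1": ("W", "x", "B"),
    "c2": ("R", "x", "C"),
}
before = {("c1", "b1"), ("b1", "c2"), ("c1", "c2")}
graph = serialization_graph(events, before, {"B", "C"})
conflict_serializable = not has_cycle(graph)
```

In this example the graph contains the edges C to B and B to C, so the execution 
is rejected by the scheduler requirement even though it may still be serializable 
in the more general sense. 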

On the other hand, there may be executions which are conflict serializable 
but which are not serializable (e.g. because the execution is not weakly coherent). 
All three properties in combination, however, guarantee serializability. 

Fig. 10. Definition of the serialization graph 



5 Extensions 

In the previous sections, we have introduced an execution model which allows 
to define serializability such that the definition covers all aspects of transaction 
management which are relevant for correctness. The emphasis of the presentation 
was on the motivation and on the basic idea of the execution model. Next, we 
will briefly discuss some extensions of the model. These extensions have already 
been worked out [7] and are only omitted for simplicity in this paper. 



5.1 Update causality 

For simplicity, we have assumed in Sect. 2 that a write operation only modifies a 
single copy of an object. In some situations, however, it might be necessary 
to propagate the value of a write event to other copies and to update these copies 
accordingly. To this end, we introduce a third kind of causality which is called 
update causality. It is strongly related to data causality but yet different. 

Figure 11 shows an example where first some value is written to object x by 
event e1. Then, the object is split into two copies (a 'left' and a 'right' one). 
On the left copy, a write event e2 and a read event e3 are performed; on the 
right copy, two read events e4 and e6 are performed. The value written by e2 is 
also propagated to the right copy. The update event e5 updates the value of the 
right copy accordingly. Update causality is represented by arrows with a 
bold-faced arrow tip but non-bold-faced lines. This indicates the close relation to 
data causality on the one hand and its difference from data causality on the other 
hand. The update event on a copy is represented by a box inscribed by U[x]. 
Thus, the read event e6 returns the same value as e3; viz. the value resulting 
from the two write events e1 and e2. 

Note that, in contrast to read, write, and commit events, an update event is not 
invoked by a transaction. Rather, an update event is invoked by the memory 
manager. Still, update events are present in the execution model in the same 
way data causality is present, in order to indicate the propagation of data. The 
update event precisely indicates the point at which a copy is updated. Therefore, 
there is a data causality and an update causality to an update event. This way, 
update causality in combination with update events allows a controlled way of 
merging different values, which was not possible with data causality because 
data causality does not branch backwards. 

Fig. 11. Update causality 

Update causality even allows sequences of write events to be updated by 
a single update event. For an interpretation of an update of a sequence of 
write events, let us consider read event e7 of Fig. 12. This read event reads 
the value which is the outcome of a sequential execution of the write events 
e1, e3, e2, e4, e6, where e2 and e4 are instantaneously updated by update event 
e5. Basically, we just insert the sequence of write operations of the update 
causality preceding the update event for the update event itself. One operational 
realization of update causality could be a log-file for all values written to some 
object, which are later on updated instantaneously. Actually, we can use update 
causality to model recovery from system crashes by log-files. 

Fig. 12. Interpretation of update causality 

With this interpretation of update causality, the definition of serializability 
can be easily transferred to this extended execution model. We only need to 
adapt the definition of compatibility of a linear arrangement of transactions 
with data causality and update causality (cf. [7]). 

5.2 Compensation operations 

Our execution model as well as the definition of serializability is completely 
syntactical; i.e. there are no assumptions on the value written by the write 
events. In particular, there are no assumptions on the relation of values written 
by two different write events. This corresponds to the classical point of view in 
serializability theory. 

Sometimes, however, it is convenient — or even necessary — to have a more 
semantical model of write events. In the context of compensating transactions, 
for example, each write operation has an inverse that undoes the write operation. 
More generally, there may be a sequence of write events which undo a complete 
transaction. This can be formalized by an equivalence relation on sequences of 
write events (e.g. in an algebraic setting). 

The technique of equivalent sequences of write events is orthogonal to the 
technique presented in our paper. Therefore, both techniques can be combined: 
For each read event, data causality (and update causality) defines a unique 
sequence of write events; the read event returns the effect of this sequence. If 
we additionally impose the equivalence relation on this sequence, we have a 
combined theory of serializability. Technically, the equivalence relation must be 
incorporated into the definition of compatibility of a linear arrangement with 
some execution (see [7] for details). 

Here, we do not present the formal details of the combination of both techni- 
ques but pose the question whether a combination is worthwhile. We guess that 
it is for the following reason: Of course, the purely syntactical model cannot deal 
with semantical compensations. On the other hand, a write event inherently does 
not have an inverse. Therefore, a compensation based theory for recovery needs 
to add some artificial parameters to write events, where the additional para- 
meter basically records the value of the object prior to the write. A combined 
theory allows to represent both aspects, compensation and syntactical recovery 
by before images, in an appropriate way. 
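
The remark that a write event has no inverse unless the prior value is recorded 
can be illustrated by a small sketch. It is only meant to show the idea of 
recovery by before-images; the class `Memory`, the undo log layout, and the 
sentinel `_ABSENT` are hypothetical and do not appear in [7] or [11]. 

```python
_ABSENT = object()  # marks "the object had no value before the write"


class Memory:
    """A single-copy memory whose writes record a before-image."""

    def __init__(self):
        self.values = {}

    def write(self, undo_log, obj, value):
        # The extra parameter of the write is the before-image.
        undo_log.append((obj, self.values.get(obj, _ABSENT)))
        self.values[obj] = value

    def undo(self, undo_log):
        # Restore before-images in reverse order of the writes.
        while undo_log:
            obj, before = undo_log.pop()
            if before is _ABSENT:
                self.values.pop(obj, None)
            else:
                self.values[obj] = before


memory = Memory()
memory.write([], "x", 1)      # committed write; its log is discarded

log = []                      # log of a transaction that will abort
memory.write(log, "x", 7)
memory.write(log, "y", 3)
memory.undo(log)              # recovery by before-images
```

Without the recorded before-image, the value 1 of x could not be recovered from 
the write of 7 alone, which is why a compensation-based theory has to add such a 
parameter to write events. 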

6 Conclusion 

In this paper, we have presented an execution model which allows serializability 
to be defined independently of a particular memory model. The definition of 
serializability captures multi-version databases as well as databases with repli- 
cation. Moreover, this definition does not only cover the aspect of concurrency 
control but also covers the aspects of memory management and recovery. 

This slightly generalized definition of serializability has the following benefits: 

1. First of all, it provides a clearer understanding of the involved concepts and 
their relation. 

2. Up to now, the definition of serializability had to be adjusted to new memory 
models. Our definition is independent from a specific memory model since 
the execution model explicitly deals with the propagation of data. 

3. Since there is a single concept of serializability which covers all aspects, there 
is a formal correctness condition for all modules of a database system. This 
helps to avoid mistakes due to incorrect interplay of different modules. In 
particular, correctness of each module can be verified by the help of the 
techniques presented in [9]. The Separation Theorem guarantees the correct 
interplay [6]. 

4. The separation into requirements concerning concurrency control and repli- 
cation control shows that a large range of modules can be combined with 
each other. Every scheduler which guarantees conflict serializability (for non- 
replicated data) can be combined with every memory manager which gua- 
rantees weak coherence — without further verification necessary. Though we 
did not introduce new algorithms or protocols for transaction management, 
we have indicated new combinations of protocols for schedulers and memory 
managers. 

5. The definition of serializability does not only capture the usual read-write 
model but also captures the action model, which allows partial updates by 
write operations and atomic modification operations such as increment or 
decrement. Though partial write operations do occur in real SQL statements, 
they are often ignored in transaction theory. 

Up to now, we have used our causality based specification technique for specify- 
ing, modelling, and verifying classical transaction protocols and for investigating 
the interplay between classical memory models and classical transaction models. 
Since these classical models are well-understood, our technique might appear to 
be of minor relevance for practice. However, the same techniques can be used 
for reasoning on new transaction models and new memory models. Since the 
interplay between new memory models and new transaction models is not so 
well-understood, our techniques might be useful in this area. 



Abstract. The rule-based update language ULTRA has been designed 
for the specification of complex database updates in a modular fashion. 
The logical semantics of update goals is based on update request sets, 
which correspond to deferred basic updates in the database. The decla- 
rative character of the logical semantics leaves much freedom for various 
evaluation strategies, among them a top-down resolution, which can be 
mapped naturally onto a system of nested transactions. In this paper, we 
extend this operational model as follows: Not only the basic operations 
are performed and committed independently from the top-level transac- 
tion, but also complex operations defined by update rules. This leads to 
an open nested transaction hierarchy, which allows to exploit the seman- 
tical properties of complex operations to gain more concurrency. On the 
other hand, high-level compensation is necessary and meta information 
must be provided by the programmer. We present the key elements of this 
combination of logic-based update languages and transaction processing 
and propose a flexible system architecture. 



1 Introduction 

In [18,19] we present the rule-based update language ULTRA. In rule-based update 
languages such as ULTRA or Transaction Logic [4,5], updates are implemented by 
goals, and rules can be used to define complex operations to be reused in other 
goals. The concept is similar to procedures/functions in classical programming 
languages. Thus, it is possible to build arbitrarily complex database operations in 
a modular fashion. In ULTRA, constructs for concurrent composition, sequential 
composition, and bulk updates are provided, which subsume the classical data- 
base operations. The logical semantics of an update goal with respect to a fixed 
database state is defined in terms of update request sets which contain insertion 
and deletion requests for ground EDB tuples. This semantics is declarative and 
does not state anything about a particular evaluation model. 

In this paper we present an operational model that implements a fragment 
of the logical semantics in terms of open nested transactions [12,14] performed 
on top of a loosely coupled database system (DBS). The structure of complex 
operations is reflected by trees of subtransactions during the execution, and the 
semantics of hypothetical states is implemented using backtrackable immediate 
updates. This also allows us to handle efficiently the non-deterministic update 
specifications that are possible in ULTRA and other rule-based update languages. 
Although non-deterministic updates are not commonplace in database systems, 
they are a valuable instrument when it comes to novel applications such as the 
combination of database operations with external actions. For instance, an 
application that does simulation or planning for a robot needs to state only the 
destination of a movement without having to care about the exact path that is 
taken. 

Our operational model features immediate updates in combination with back- 
tracking, recovery by compensation, and nested transactions. Exploitation of 
additional semantical knowledge of complex operations, which correspond to 
subtransactions, for concurrency control and recovery is a key feature of the 
model. In contrast to the work presented in [17], which operated strictly at 
the level of basic operations and consequently used nested transactions only as 
rollback spheres, the additional benefits of nested transactions, like handling of 
long-running transactions, increased possibilities of concurrency, or more effi- 
cient compensation of complex operations are incorporated in the new model as 
well. The sequential program fragment and the execution strategy discussed in 
this paper, for instance, will lead to more inter-transaction concurrency with the 
possibility to soften the ACID properties [3] . Other strategies featuring parallel 
subtransactions may also profit with respect to intra-transaction concurrency. 
However, this is still a point of ongoing work. In the open nested transaction 
model it must be possible to issue a compensating subtransaction [10] for every 
committed subtransaction. These compensating (undo) transactions can be spe- 
cified by the same means, i.e. using ULTRA rules, as the forward (do) operations. 
We identify the components that are necessary for the transactional execution 
of logical update queries and give a protocol for the interaction between logical 
evaluation and a transaction scheduler. It turns out that the logical evaluation 
is independent from operational aspects like the actual scheduling protocol, so 
we can use the results from database theory here and do not have to invent e.g. 
special scheduling protocols. The only requirement is that the scheduling protocol 
corresponds to the selected transaction model, i.e., in our case, open nested 
transactions. The key components of our model (logical evaluation, transaction 
scheduling, and external DBS) can be selected and tuned essentially independently 
from each other according to the requirements of the application. 

An important aspect in the treatment of open nested transactions is the meta 
information that is needed to schedule complex operations and to compensate 
already committed subtransactions. Although the meta information, i.e. the data 
about the compatibility or conflict of complex operations and the data about 
how a complex operation can be compensated, is essential for a practical system, 
it has not received much attention yet. Compatibility information is usually 
assumed to exist in the form of a compatibility relation, but details on how 
this relation can be obtained and what must be taken into consideration are 
rarely addressed. The same applies to compensation, where little if anything is 
said about the internal structure of compensating actions, about how to pass 
parameters, etc. We sketch a method to declare compatibility information and 
show how it can be used for an actual compatibility test. We further discuss the 
central issues concerning compensation in our framework. 

In the literature, various other logic- or rule-based update languages are pre- 
sented, among them Chen’s approach [6], Transaction Logic [4,5], and U-Datalog 
[13]. These approaches mainly deal with specification of updates, while execu- 
tion is not in the main scope of investigation. In this paper, we apply results 
from database theory to ULTRA as one representative of such update languages. 
We provide a mapping between complex operations specified by logical rules 
and nested transactions and use nested transactions not only for backtracking, 
but also to increase concurrency. Furthermore, we allow high-level compensation 
operations, which are again specified in the logic language. As the components 
of our architecture can be adapted to various evaluation strategies and sche- 
duling protocols, we are sure that the techniques described in this paper can 
also be applied to other rule-based update languages, if these are extended to 
provide the necessary additional meta information. We are also convinced that 
our model can be adopted for general update programs that can be combined in 
a modular fashion, for instance stored procedures in SQL [7]. The evaluation of 
stored procedures is simpler than the evaluation of update rules, but the tech- 
nical results about scheduling and the declaration of meta information apply to 
both languages. 

HiPAC [8], an active object-oriented database management system, also uses 
nested transactions for the execution of ECA rules. In contrast to the model 
described in this paper, the nested transactions of HiPAC only reflect the internal 
structure of the triggered rules and are not used to increase the possible amount 
of concurrency. To accomplish the requirements of the active component, e.g. 
coupling modes between rules, the transaction model is extended with special 
features like “deferred subtransactions” or “nested top transactions”. Yet, the 
semantics of HiPAC is operationally defined based on this extended nested tran- 
saction model, while our approach combines the purely declarative semantics 
of ULTRA with an open-nested-transaction-based operational model. Moreover, 
the paradigm behind ECA rules is inherently different from that behind ULTRA 
rules. 

The rest of the paper is organized as follows: In Section 2 we recall the key 
elements of the sequential fragment of the ULTRA language. We develop and 
discuss a new operational model in Section 3. Section 4 presents some central 
points concerning meta information, before Section 5 summarizes the paper and 
collects some research issues to be addressed in the future. 

The work described in this paper has been funded by the German Research 
Agency (DFG) under contract number Fr 1021/3-1. 



2 The ULTRA Language 



ULTRA extends the syntax known from Datalog [11] by basic update atoms, 
which can be used in rule bodies. The predicates defined by rules correspond to 
complex update operations on the extensional database (EDB), whereas in pure 
Datalog they correspond to views. Presently, the technical descriptions of ULTRA 
are restricted to insertions INS r(t1, ..., tk) and deletions DEL r(t1, ..., tk) of 
EDB tuples as basic update atoms, although other basic operations can smoothly 
be integrated into the ULTRA concept. To compose multiple operations, ULTRA 
introduces the new connective ":" (sequential conjunction) besides a concurrent 
conjunction derived from the traditional conjunction "," used in Datalog rules. 
While two concurrently composed subgoals are evaluated in the same state, 
sequential conjunction means that the subgoal on the right refers to an 
intermediate state that results from the execution of updates specified by the left 
subgoal. Consequently, the sequential conjunction is associative but not 
commutative. The general form of sequential rules in the ULTRA language is 

p(X) ← q1(Y1) : q2(Y2) : ... : qn(Yn) 

where p(X) and each qi(Yi), 1 ≤ i ≤ n, is an atom as known from Datalog 
or a basic update atom as introduced above. An update program is a set of 
rules that specify a collection of complex operations in a modular fashion. In 
analogy to Datalog, recursive update programs are allowed and have a 
well-defined interpretation. A detailed description of the ULTRA syntax and semantics 
can be found in [19]. 

In the top-down- and tuple-oriented operational semantics referred to in this 
paper, we restrict ourselves to sequentially connected subgoals, i.e. rules of the 
form shown above; the extension to the full language, including concurrent con- 
junction and especially a bulk quantifier for the specification of set-oriented 
updates, is subject of our present research (cf. Section 5). 

Example 1 (Bank Transfer). Here we restate the standard example of how to 
specify the transfer of money from one bank account to another. This classical 
transaction example is suitable to illustrate the new features of the operational 
model described below. Again, we consider only insertion and deletion as basic 
update operations. Read access to the database is specifiable by EDB or IDB 
atoms, but in order to keep the example short we avoid IDB relations (views) 
expressed by deductive rules. In addition to the classical syntax, we allow simple 
constraints to express arithmetic computations. 

The update program, i.e. the program written in the ULTRA language, contains 
the update rules 

transfer(Amo, Ac1, Ac2) ← withdraw(Amo, Ac1) : deposit(Amo, Ac2) 

withdraw(Amo, Ac) ← account(Ac, Bal) : Amo < Bal : 
                    DEL account(Ac, Bal) : 
                    Bal' = Bal - Amo : INS account(Ac, Bal') 

deposit(Amo, Ac) ← account(Ac, Bal) : DEL account(Ac, Bal) : 
                   Bal' = Bal + Amo : INS account(Ac, Bal') 

which specify a complex operation transfer built from (complex) sub-operations 
withdraw and deposit. The latter operate on the EDB, which consists of one 
binary relation account(Number, Balance). A complex operation is considered 
as successful, if all sub-operations in one of its defining rules are successful. As 
we will see below, it makes sense to handle sub-operations as subtransactions 
that may even be committed independently from the top-level transaction. This 
leads to an open nested transaction hierarchy [16]. 
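
To make the reading of the rules concrete, the following sketch mirrors them in 
Python. It is not the ULTRA evaluation mechanism: the EDB is modelled as an 
immutable set of (Number, Balance) pairs, each sequential conjunct becomes one 
statement, and a failing subgoal is modelled by returning None. It further 
assumes that the account number is a key of the relation. The comparison Amo < Bal 
is taken from the rule as printed, and all Python names are hypothetical. 

```python
def balance_of(edb, ac):
    """Return the balance stored for account `ac`, or None if absent."""
    for number, balance in edb:
        if number == ac:
            return balance
    return None


def withdraw(edb, amo, ac):
    bal = balance_of(edb, ac)           # account(Ac, Bal)
    if bal is None or not amo < bal:    # Amo < Bal
        return None                     # the goal fails
    edb = edb - {(ac, bal)}             # DEL account(Ac, Bal)
    return edb | {(ac, bal - amo)}      # Bal' = Bal - Amo : INS account(Ac, Bal')


def deposit(edb, amo, ac):
    bal = balance_of(edb, ac)           # account(Ac, Bal)
    if bal is None:
        return None                     # the goal fails
    edb = edb - {(ac, bal)}             # DEL account(Ac, Bal)
    return edb | {(ac, bal + amo)}      # Bal' = Bal + Amo : INS account(Ac, Bal')


def transfer(edb, amo, ac1, ac2):
    # Sequential conjunction: deposit sees the state produced by withdraw.
    intermediate = withdraw(edb, amo, ac1)
    if intermediate is None:
        return None
    return deposit(intermediate, amo, ac2)


start = frozenset({(88004, 1000), (88009, 5000)})
result = transfer(start, 1000, 88009, 88004)
rejected = transfer(start, 6000, 88009, 88004)
```

Because each function returns a new state and leaves its argument untouched, a 
failed transfer has no effect on the initial state in this sketch. 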

In the ULTRA system, a transaction is invoked by a top-level update goal which 
is submitted as a query. Evaluation has to be performed as a (top-level) transac- 
tion to guarantee the ACID properties [2,3]. In particular, it must be ensured 
that the data accessed by a transaction is not affected simultaneously by other 
transactions (isolation) and that either all changes caused by the transaction 
are applied to the database, or none of them at all (atomicity). To achieve this, 
the execution of all basic update requests as well as read access to the database 
has to be certified by the system. If the execution of an operation could possibly 
compromise the ACID properties, the operation cannot be certified, so the sy- 
stem may decide to delay it, to abort the whole transaction, or to do something 
else to resolve the violation of the ACID properties. 

The model-theoretic semantics of the insertion/deletion-oriented ULTRA language 
assigns update request sets to every successful update goal. These update 
request sets contain insertion requests +r(t1, ..., tk) and deletion requests 
-r(t1, ..., tk) for tuples of EDB relations. The solutions for a top-level query 
are called possible transitions, as they represent transitions from the given initial 
state to a desired final state of the transaction. Non-deterministic transactions 
may generate more than one possible transition. Note that the characterization 
of the possible transitions for an update query does not depend on a particular 
evaluation strategy. Before a transaction invoked by a query can be committed, 
one of the possible transitions must have been materialized, i.e. its update re- 
quests must have been executed and committed on the persistent EDB. However, 
the logical semantics does not restrict the choice of a possible transition or the 
time of materialization in any way. 

Example 2 (Bank Transfer (Cont.)). Recall Example 1 above and consider the 
query ψ := ← transfer(1000, 88009, 88004). The (unique) possible transition for 
ψ is encoded by the following update request set Δ, assuming that the accounts 
88004 and 88009 have a balance of $1000 and $5000, respectively: 

Δ = { -account(88004, 1000), -account(88009, 5000), 
      +account(88004, 2000), +account(88009, 4000) } 

Due to space limitations, we do not describe in detail how the possible transitions 
are constructed using the ULTRA semantics, but refer the reader to [18]. Note 
that the update request sets do not express an evaluation by subtransactions. 
They just express the resulting changes of the accounts. 
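
Materializing a possible transition then amounts to applying its update request 
set to the EDB. The sketch below shows this for the request set of Example 2. 
The encoding of requests as (sign, tuple) pairs and the function name 
`apply_request_set` are assumptions for illustration; rejecting a set that both 
inserts and deletes the same tuple is likewise a choice of this sketch, not a 
statement about the ULTRA semantics. 

```python
def apply_request_set(edb, requests):
    """Apply insertion ("+") and deletion ("-") requests to a set of tuples."""
    deletions = {t for sign, t in requests if sign == "-"}
    insertions = {t for sign, t in requests if sign == "+"}
    if deletions & insertions:
        # Assumption of this sketch: such request sets are not materialized.
        raise ValueError("conflicting requests for the same tuple")
    return (set(edb) - deletions) | insertions


edb = {("account", 88004, 1000), ("account", 88009, 5000)}
delta = {
    ("-", ("account", 88004, 1000)),
    ("-", ("account", 88009, 5000)),
    ("+", ("account", 88004, 2000)),
    ("+", ("account", 88009, 4000)),
}
final_state = apply_request_set(edb, delta)
```

The request set carries no trace of the withdraw and deposit steps; the order in 
which its requests are applied does not matter here. 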

3 The Operational Model 

In the ULTRA concept, the execution of a transaction consists of two types of 
processing: the evaluation of a query (for binding variables and computing the 
update requests) and the application of selected update requests to the EDB. In 
the architecture described in [19], execution was strictly divided into two subse- 
quent phases doing exclusively evaluation and application, respectively. Thus, in 
the first phase all possible transitions of the transaction goal had to be computed 
without changing the physical EDB instance, and in the second phase one pos- 
sible transition could be materialized. This corresponds to the logical semantics 
described in Section 2. However, an operational semantics completely based on 
deferred updates has several drawbacks (see [17] for a more detailed discussion): 

First, there is a need for hypothetical reasoning when referring to interme- 
diate states: as the operations leading to an intermediate state are known but 
not carried out yet, their effects on the state are not visible and thus must be 
computed by a reasoning component. An axiomatization of the observable effects 
is necessary to enable such hypothetical reasoning. Unfortunately, this is only 
tractable for simple basic operations like insertions and deletions. 

A second practical problem results from performing a transaction in two 
strictly separated phases (evaluation and materialization). Such a system does 
not show a continuous behaviour during the evaluation and thus is not suitable 
to be extended by e.g. interactive components. It merely implements a batch 
mode, where action requests are collected to be performed later. 

Finally, the standard bottom-up evaluation as proposed for the ULTRA se- 
mantics always computes all possible transitions in the evaluation phase. Es- 
pecially in presence of non-deterministic specifications and much hypothetical 
reasoning this may lead to a lot of unnecessary work, and even small examples 
may not be tractable anymore. 

To solve these problems we reuse the top-down left-to-right evaluation stra- 
tegy well-known from Prolog and apply it to sequential ULTRA programs. During 
evaluation, updates are not collected for later execution, but executed imme- 
diately using database techniques. Evaluating queries in this top-down fashion 
results in a resolution tree, which can be mapped onto a nested transaction tree 
(see Fig. 1). This is a well-known fact and has been used e.g. in the model de- 
scribed in [17]. There, we show how subtransactions are used as rollback spheres 
to implement a backtracking that fits with the logical semantics. 



[Figure: resolution tree with root transfer(1000, 88009, 88004) and leaves acc(88009,Bal), DEL acc(88009,Bal), INS acc(88009,Bal'), acc(88004,Bal), DEL acc(88004,Bal), INS acc(88004,Bal')]

Fig. 1. Resolution tree corresponding to a nested transaction tree



Yet, using subtransactions only to aid backtracking is not satisfying. Nested 
transactions were invented in the database community to, among other reasons, 
increase the possible amount of concurrency between transactions by using the
additional semantical knowledge of complex operations. This can of course also
be done in the case of logical update languages as investigated in this paper. 
All that is needed is a scheduler for nested transactions and meta information 
about compatibility or conflict between the operations. The latter is discussed 
in Section 4. 



3.1 Execution of Update Queries 

In the following, we assume that P is an ULTRA program consisting of update 
rules as introduced in Section 2. 

Definition 1 (Transactional Update Query). A transactional update
query has the form $\leftarrow g(c_1, \ldots, c_k)$, where $g(c_1, \ldots, c_k)$ is an update atom.

Note that the restriction of queries to only one atom is not a severe one, as
complex queries of the form $\leftarrow g_1(c_1) : \ldots : g_n(c_n)$ can be expressed using a
new rule $\mathit{query} \leftarrow g_1(c_1) : \ldots : g_n(c_n)$.

Execution of transactional update queries is done by two components: a 
logical evaluation on the one hand, and a scheduler on the other hand. Seen as 
black boxes, the task of the logical evaluation is to take a transactional update 
query, evaluate it, and return success or failure. The task of the scheduler is to 
take operations (atomic goals) including special operations to begin, commit, 
or abort (sub-) transactions and to execute them atomically or reject them. If 
an operation is executed, its outcome (success or failure) and additional results 
(variable bindings for read operations) are returned. If it is rejected, failure is 
returned. 

With these two components, transactional update queries can be executed 
as follows: 

Definition 2 (Execution of Transactional Update Queries).

Logical evaluation of query $\leftarrow g(c)$:

I.1 Resolve the goal $g(c)$ against the update program $P$. This results in a set of
(partially) instantiated rule bodies for $g$.

I.2 Choose one of the rule bodies, say $g_1(X_1) : \ldots : g_n(X_n)$.

I.3 Send a begin-of-transaction request to the scheduler.

I.4 For each $1 \le i \le n$, take the instantiated subgoal $g_i(c_i)$:

- Send the goal $g_i(c_i)$ to the scheduler and wait for the result.

- If the scheduler returns success and exactly one result, use this result
to obtain an answer substitution $\theta_i$, apply it to the subsequent subgoals
$g_{i+1}, \ldots, g_n$ and continue the loop with the next subgoal.

- If the scheduler returns success and more than one result, create a new
choice-point. Begin a new subtransaction by sending the corresponding
request to the scheduler, choose one of the results to obtain an answer
substitution $\theta_i$, and continue the loop with the next subgoal.

- If the scheduler returns failure but there exist unused results from a previous
choice-point, force a rollback of the current subtransaction, i.e. the
subtransaction created as described before. Also undo the application of
the current substitution $\theta_i$, choose another result to obtain a new
substitution $\theta_i$, apply it to the subgoals $g_{i+1}, \ldots, g_n$ and continue the loop
at point I.4. A choice-point and a new subtransaction have to be created
again if other untried choices are still left.

- If the scheduler returns failure and there is no possibility to use other
variable bindings (i.e. all possible choices for $\theta_i$ failed), react to that
failure by backtracking: discard the variable bindings obtained, send an
abort request relative to the current (sub-)transaction to the scheduler,
and either retry the current operation (i.e. continue at point I.3), choose
another possible rule body (i.e. continue at point I.2), or return failure.

I.5 If all subgoals have succeeded, send a commit request to the scheduler. If
this commit is acknowledged, return success. If the commit is rejected by the
scheduler, return failure.

Scheduling the execution of request $g(c)$:

II.1 If a begin-of-transaction request is received, start a new (sub-)transaction.

II.2 If a commit request is received, test according to the scheduling protocol if this
commit is possible. If so, return success as soon as the commit is executed.
Otherwise, abort the current (sub-)transaction and return failure.

II.3 If an abort request is received, abort the current (sub-)transaction and report
completion of the abort. See Section 3.3 for details.

II.4 If the requested operation $g(c)$ is a basic operation, schedule it according
to the concurrency control protocol. If the operation may be executed, send
it to the data manager for execution. Return the outcome value obtained
from the data manager together with additional results. Return failure if the
data manager refused the operation or it was not allowed by the scheduling
protocol.

II.5 If the requested operation $g(c)$ is a complex operation, schedule it according
to the concurrency control protocol. If the operation may be executed, send
the transactional update query $\leftarrow g(c)$ to the logical evaluation and return the
outcome of this query (success/failure). Return failure if it was not allowed
by the scheduling protocol.

The scheduler has to record all actions taken together with their outcome in the 
persistent system log. 

During logical evaluation, non-deterministic choice is necessary at points I.2
and I.4. These choices can possibly lead to failure in a goal evaluated later, as
committing the subtransaction also commits the choice. We assume here that 
this is handled by the least common ancestor in the transaction tree of the 
operation which did the choice and the one which failed later, e.g. by retrying 
the evaluation with other choices. 
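
The interplay of the evaluation steps (I.1 to I.5) and the scheduler steps (II.1 to II.5) can be illustrated by a minimal Python sketch. It is not the ULTRA prototype: the names `Scheduler`, `execute`, `PROGRAM`, and `BASIC` are hypothetical, there is no concurrency control, read operations with several results (choice-points) are left out, and an abort simply restores before images instead of compensating.

```python
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Toy stand-in for scheduler plus data manager (assumption: single-threaded)."""
    db: dict                                        # account number -> balance
    undo_stack: list = field(default_factory=list)  # one undo list per open (sub-)transaction

    def begin(self):
        self.undo_stack.append([])

    def commit(self):
        # The parent inherits the undo information of the committed child.
        done = self.undo_stack.pop()
        if self.undo_stack:
            self.undo_stack[-1].extend(done)
        return True

    def abort(self):
        # Roll back the current (sub-)transaction by restoring before images.
        for key, old in reversed(self.undo_stack.pop()):
            self.db[key] = old

    def write(self, key, value):
        self.undo_stack[-1].append((key, self.db[key]))
        self.db[key] = value
        return True


def debit(sched, amount, acc):
    if acc not in sched.db or sched.db[acc] < amount:
        return False                                # logical failure (overdraft)
    return sched.write(acc, sched.db[acc] - amount)


def credit(sched, amount, acc):
    if acc not in sched.db:
        return False                                # logical failure (unknown account)
    return sched.write(acc, sched.db[acc] + amount)


BASIC = {"debit": debit, "credit": credit}

# Each complex operation has a list of alternative rule bodies; a body maps the
# call arguments to a sequence of ground subgoals (name, args).
PROGRAM = {
    "withdraw": [lambda amo, ac: [("debit", (amo, ac))]],
    "deposit": [lambda amo, ac: [("credit", (amo, ac))]],
    "transfer": [lambda amo, a1, a2: [("withdraw", (amo, a1)),
                                      ("deposit", (amo, a2))]],
}


def execute(sched, program, basic, goal):
    """Evaluate a ground goal top-down, left to right; updates happen immediately."""
    name, args = goal
    if name in basic:                               # basic operation (II.4)
        return basic[name](sched, *args)
    for body in program.get(name, []):              # resolve and choose a body (I.1, I.2)
        sched.begin()                               # I.3
        if all(execute(sched, program, basic, sub) for sub in body(*args)):  # I.4
            return sched.commit()                   # I.5
        sched.abort()                               # failure: roll back, try next body
    return False
```

In this sketch a transfer to an unknown account fails in `deposit`, and the abort of the enclosing `transfer` subtransaction also removes the update of the `withdraw` subtransaction that had already committed into it.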

The execution method described above can also deal with recursive programs. 
On the side of the logical evaluation, handling of recursion is a well-known issue.
On the transactional side, recursion does not pose new problems, as what the
scheduler encounters is only the unfolded recursion. So the only requirement for 
the scheduler is that it can handle arbitrarily deep nested transactions. Note 
that especially conflicts between different recursively nested levels cannot occur, 
as in nested transactions there are no conflicts between a subtransaction and 
its ancestors. However, termination of recursive programs is not guaranteed in 
general: the behaviour depends on properties of the update program as well as on 
choice strategies during the evaluation. Yet, the semantics of a non-terminating 
program is undefined in general, and not a special problem introduced by our 
execution method. 

The problem that a non-terminating recursive unfolding will never yield a 
consistent database state could be tackled by introducing a depth limit for the 
nested transactions. If this limit is reached, the creation of the next subtransac- 
tion will fail and backtracking will be enforced. Consequently, infinite branches 
in the transaction hierarchy are excluded. Note that such a solution is purely 
operational and destroys the theoretical universality of the recursion, having con- 
sequences also for non-recursive programs. From the practical point of view, the 
limit value should be high enough to support the (sequential) implementation 
of set-oriented updates by recursive rules. 
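
A depth limit of this kind can be sketched in a few lines of Python. The class `DepthLimitedScheduler`, its methods, and the recursive operation `count_down` are hypothetical names chosen for illustration; the sketch only shows how a refused begin-of-transaction request turns an overly deep recursion into a logical failure.

```python
class DepthLimitedScheduler:
    """Refuses to open subtransactions beyond a fixed nesting depth."""

    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.depth = 0

    def begin(self):
        if self.depth >= self.max_depth:
            return False            # creation of the subtransaction fails
        self.depth += 1
        return True

    def end(self):
        # Called on commit as well as on abort of the current subtransaction.
        self.depth -= 1


def count_down(sched, n):
    """Recursive operation: succeeds if n reaches 0, one subtransaction per level."""
    if not sched.begin():
        return False                # treated like a logical failure, forces backtracking
    try:
        return n == 0 or count_down(sched, n - 1)
    finally:
        sched.end()
```

With a limit of 10, `count_down(sched, 5)` succeeds, whereas `count_down(sched, 50)` and the otherwise non-terminating `count_down(sched, -1)` both fail. The second case shows the loss of universality mentioned above: a terminating program is rejected merely because it nests too deeply.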

Note that there may be various threads doing logical evaluation in the sy- 
stem, which may also belong to different top-level transactions that are concur- 
rently executed. Yet, there is only one central scheduler, which ensures that the 
interleaved execution of concurrent transactions is correct, i.e. serializable. 

Proposition 1. Execution of transactional update queries as described in Definition 2
performs the updates that are required by the logical semantics of ULTRA.
Proof (sketch): For every update query, the ULTRA semantics yields one or
more update request sets that capture the logical meaning of the query. To execute 
the query, one of the sets has to be chosen and materialized. 

Our operational model executes the updates immediately, so, if there is no 
logical failure during the evaluation, the “sum” of all the immediate updates 
corresponds to one of the update request sets of the ULTRA semantics. If, on the 
other hand, a logical failure occurs, all updates done on the failing branch so far 
are removed from the database by aborting the corresponding subtransaction. So, 
failing branches cause no updates that could compromise correctness with respect 
to the logical semantics. 

Finally, during evaluation there are several points where non-deterministic
choices are made (see I.2 and I.4 in Definition 2). If the choice results in a
failing branch, its updates are removed. If, on the other hand, the choice leads to 
a successful branch, this corresponds to one possible transition due to the ULTRA 
semantics. The only difference is that the decision to materialize this branch and 
not another one would have been delayed at the semantical level until all possible 
update request sets are known, while the operational model anticipated the choice. 

The execution of the transfer query p of Example 2 is shown in Fig. 2 as a 
time-line diagram.

Fig. 2. Successful execution of a transfer query



Execution of queries in this operational model allows us to exploit concurrency
between transactions not only at the level of basic operations (point II.4
in Definition 2), but also between high-level operations (point II.5). Thus, our
execution model goes beyond the simple use of subtransactions for backtracking.

Proposition 2. Interleaved executions of transactional update queries according 
to Definition 2 obey the ACID properties.

Proof (sketch): The execution of every (basic or complex) operation is controlled
by the scheduler. So, if the scheduler uses a concurrency control protocol
that guarantees serializability, this property is extended to the execution of tran- 
sactional update queries, too. 

Note that we do not prescribe a certain scheduling protocol or method. We only 
require a protocol that can handle nested transactions and guarantees serializa- 
bility. Whether this is done by locking, optimistic approaches, or something else 
is left to the scheduler. 

An implication of this proposition is that every transaction in the system 
must be under the control of the ULTRA scheduler. Local transactions that ope- 
rate directly on the database can destroy serializability of the interleaved exe- 
cutions. Therefore, we assume that no local transactions are executed. However, 
techniques developed for federated or multidatabase systems could be adopted 
here. 

3.2 Compensation 

Compensation is the key technique to enable recovery of open nested transac- 
tions. For every do operation that makes certain changes to the database, a 
corresponding undo operation has to be provided that removes the changes that 
have been committed independently of the top-level transaction. As other tran- 
sactions may already have read these changes and may have made their own 
modifications, recovery by simply restoring the before images is not adequate. 
Instead, the changes to be removed are semantically undone by a compensating 
operation. Of course, the conventional restoration of a before image can be seen 
as a compensation method for simple basic operations. 

In the extended ULTRA system, every operation must also have a correspon- 
ding undo operation. For basic operations, the undo actions can be assumed to be 
already implemented in the data manager, such that the scheduler can call them 
directly. Complex operations mostly can be compensated only by other complex 
operations. So, the scheduler must be informed which operation it must invoke 
(see also Section 4.1). 

Undo operations must in general be provided by the programmer who wrote 
the corresponding do operation, as she knows its exact semantics and what is 
necessary to undo its effects. Although it would be possible for the system to ge- 
nerate compensation steps automatically from the structure of the compensated 
operation and the current database and log state, in essence this would lead to 
compensation only at the level of basic operations. Moreover, the compensating 
action may need some additional parameters that depend on the internal states 
of the forward action and that must be recorded by the log. In this paper we 
restrict ourselves to complex operations which can be compensated without
providing additional parameters, i.e. all the information needed to compensate a
complex operation $g(c)$ is contained in the arguments $c$. Note that this property
does not hold for the basic insertions and deletions in ULTRA: to undo an insert
operation, for instance, the scheduler must know whether the insertion was
proper or not, i.e. whether the tuple was already contained in the database before.

The crucial point with compensation is that the subtransaction corresponding
to the undo operation must not fail. First, this is a requirement to the scheduler: 
it must not abort a compensating operation due to transactional conflicts. This 
problem can be solved with adequate scheduling protocols which allow only hi- 
stories that are prefix reducible, for example based on [9,15], or by a simple retry 
of the aborted subtransaction. But it is also a requirement to the logical evalua- 
tion and the compensating operation itself: If the evaluation of the compensating 
goal fails logically, i.e. due to the logical semantics, the recovery is (currently) not 
possible. The programmer must be made responsible to provide compensating 
actions that always have a logical solution. Otherwise the system may remain in 
an uncertain state until recovery becomes possible or is done manually. 

Definition 3 (Compensation (acc. to [10])). An operation $p^{-1}(c')$ compensates
another operation $p(c)$, iff $p(c) : p^{-1}(c')$ is the identity mapping, i.e.
from a semantical point of view a null operation.

3.3 Handling Logical and Transactional Failure 

As shown in [17], logical failure corresponds to an abort of a subtransaction. 
Because changes made by committed subtransactions are visible to other tran- 
sactions running concurrently in the open nested transaction model, we use com- 
pensation as described above to undo changes of committed subtransactions. 

Definition 4 (Abort of Transactions). To abort a (sub-) transaction, the 
scheduler proceeds as follows: 

1. Record the beginning of the compensation in the log. 

2. Consult the log and obtain all successful operations $g_1(c_1), \ldots, g_n(c_n)$ within
the current (sub-)transaction that have been executed.

3. Compensate the operations $g_i(c_i)$, $1 \le i \le n$, in the reverse order, i.e. starting
with operation $g_n(c_n)$ and proceeding until $g_1(c_1)$, as follows:

4. If $g_i(c_i)$ is a basic operation, issue the corresponding compensating operation
$g_i^{-1}(c_i')$ to the data manager.

5. If $g_i(c_i)$ is a complex operation, send the corresponding compensating operation
$g_i^{-1}(c_i')$ to the logical evaluation as a transactional update query
$\leftarrow g_i^{-1}(c_i')$.

6. Record the execution of the compensating actions in the log, as well as the 
completion (commit) of the compensation when all compensating operations 
have been executed. 

An unsuccessful execution of the transfer query of Example 2 is shown in Fig. 
3. The abort of subtransaction 1.1 requires the compensation of the already 
committed withdraw operation by a corresponding deposit operation. 

A transaction which is aborted by compensation gets physically committed 
after all the compensating operations have been executed. This is necessary 
because changes of the do operations are undone by explicit compensation and 
not by simply restoring an old state. 
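
The abort procedure of Definition 4 can be sketched as follows. This is a simplified illustration under stated assumptions: the log is a plain Python list of `(transaction id, kind, operation)` entries, the table `UNDO` and the helpers `abort_by_compensation` and `make_runner` are hypothetical, and basic and complex operations are not distinguished (both are handed to the same `run` callback).

```python
# Which operation compensates which one (assumed declarations, same arguments).
UNDO = {
    "withdraw": lambda amo, ac: ("deposit", (amo, ac)),
    "deposit": lambda amo, ac: ("undo_deposit", (amo, ac)),
}


def abort_by_compensation(log, txn_id, run):
    """Undo all logged operations of txn_id in reverse order by compensation."""
    log.append((txn_id, "begin-compensation", None))             # step 1
    done = [op for (t, kind, op) in log
            if t == txn_id and kind == "do"]                      # step 2
    for name, args in reversed(done):                             # step 3
        compensating_op = UNDO[name](*args)
        run(compensating_op)                                      # steps 4 and 5
        log.append((txn_id, "undo", compensating_op))             # step 6
    log.append((txn_id, "compensation-committed", None))          # step 6 (commit)


def make_runner(db):
    """Toy executor: deposit adds, withdraw and undo_deposit subtract."""
    def run(op):
        name, (amo, ac) = op
        db[ac] += amo if name == "deposit" else -amo
    return run
```

For example, if the log of subtransaction `"1"` contains only a `withdraw` of 1000 from account 88009, calling `abort_by_compensation(log, "1", make_runner(db))` executes `deposit(1000, 88009)` and records the compensation and its commit in the log.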




[Figure: time-line diagram with lanes for the evaluation threads, the scheduler, and the data manager]

Fig. 3. Failing execution of a transfer query



To allow for the compensation of complex operations (labeled $g_i$ in Definition 4
above), the corresponding high-level compensating operations (called $g_i^{-1}$
above) must be provided by the programmer. Compensating operations can be
specified by update rules just like normal complex operations.

It is important to stress the similarities between backtracking on the logical 
side and compensation on the transactional side. Yet, compensation with complex,
high-level operations introduces a new dimension since it can be seen as a
kind of high-level backtracking: instead of going back through each single step that was
done with the forward operations, compensation allows one to “jump back” several
steps at once. This can of course also be exploited to optimize the reasoning of
a rule-based component within a larger system architecture with this kind of
intelligent backtracking.



[Figure: transaction tree with root transfer(1000, 88009, 88004); the leaf operations include acc(88009,Bal), DEL acc(88009,Bal), and INS acc(88009,Bal')]

Fig. 4. Recovery by compensation



The transaction hierarchy depicted in Fig. 4 shows several examples of com- 
pensation. The failure in the deposit operation causes the execution of the com- 
pensating operations for the already executed operations deleting a tuple and 
reading the database. The deposit subtransaction is committed after compensa- 
tion, and the logical abort is reported to the transfer operation. If also transfer 
decides to fail, i.e. to abort its transaction, the already committed withdraw ope- 
ration must be compensated, too. This is done by executing a deposit operation 
(see Example 3 in Section 4.1 for a discussion about compensation in our running 
example). 

Note that the use of compensation interferes with the argumentation of Pro- 
position 1. There, the proof is based on the fact that failing branches do not cause 
any updates on the database. This is guaranteed operationally by aborting the 
subtransaction corresponding to that failing branch. Now, with compensation,
failing branches do in fact cause updates which even get committed, but from the
semantical point of view, the combined effect of all these updates is null. Nevertheless,
at the operational level there are updates which are not contained in any
of the update request sets defined by the model-theoretic semantics of ULTRA.
To be able to handle this formally we are currently developing a model where the 
update request objects are extended from simple sets to structured multi-sets 
resembling a kind of log. This new model will facilitate reasoning about equi- 
valence of logs and therefore will be suitable to also capture the compensation 
semantics. 



3.4 Architecture 

The operational model described in this section can be realized in a system 
architecture as depicted in Fig. 5. This reflects also the actual architecture of
our prototype system, which is implemented in the Java programming language
on top of the relational database management system ORACLE. We used the 
prototype system to verify that our model is also suitable for the transactional 
execution of update queries over wide area networks like the internet. Further, 
we let the data manager also communicate with an external device (a virtual 
robot arm) instead of a database system, demonstrating the universality of the 
ULTRA approach. 



[Figure: architecture diagram; the input is labeled "Transactional Update Query"]

Fig. 5. The new ULTRA architecture



The responsibilities of the various components of our architecture are as fol- 
lows: The external database management system is used as a persistent storage 
for EDB data as well as for logging purposes. The data manager executes all 
basic operations of the ULTRA program. The ULTRA scheduler is responsible 
for concurrency control and recovery of complex as well as basic operations. It 
receives requests to execute an operation, schedules them according to a nested 
transaction scheduling protocol, and “executes” the operation: basic operations 
are passed to the data manager, complex operations are forwarded to a newly 
created thread of the ULTRA evaluation. If compensation is necessary during recovery,
the corresponding operations are generated and sent to the data manager
or the evaluation, too. The ULTRA evaluation gets queries from the user or the 
scheduler. These queries are partially resolved, and the resulting subgoals are 
sent to the scheduler for execution. The interaction protocols of the scheduler 
and the evaluation have been described in detail in Sections 3.1 and 3.3. 

4 Meta Information 

As explained in the previous sections, the “pure” logic program, i.e. the ULTRA 
rules, must be enriched with additional information. On the one hand, com- 
patibility information is needed to allow correct scheduling, on the other hand 
compensation information is needed for recovery. 



4.1 Compensation 

To enable the scheduler to do compensation as described in Sections 3.2 and 3.3, 
it must first be provided with the compensating operations per se, and second it 
must be informed about which do operation is undone by which undo operation. 
These two aspects of additional information are treated in the following. 

Example 3 (Bank Transfer (Cont.)). As an example, let us specify the compen- 
sating rules for the operations of Example 1. 

Compensating a withdraw is the easiest case. To undo the debit, the amount
of money is simply credited back on the account. So the compensation for
withdraw(Amo, Ac) is deposit(Amo, Ac). This could be declared in a meta program
by the following non-ground fact:

undo( withdraw(Amo, Ac), deposit(Amo, Ac) )

Undoing a deposit operation needs some deeper thoughts. Although one is 
tempted to say that, in analogy to the above, a deposit can be compensated by 
a withdraw, this can lead to problems. Recall that a withdraw operation checks 
for an overdraft on the account and fails if the balance would become negative. 
So compensation would fail logically, which must be avoided. To do so, there 
are two possibilities: First, the responsibility can be delegated to the scheduler, 
which must then ensure that the withdraw compensating a deposit does not 
fail logically. This essentially means that no withdraw operation from another 
transaction must have been executed between the deposit and its compensating 
withdraw - obviously a severe restriction. The alternative would be to provide 
a compensating operation that cannot fail logically, i.e. something like 

undo_deposit(Amo, Ac) ← account(Ac, Bal) : DEL account(Ac, Bal) :
                         Bal' = Bal - Amo : INS account(Ac, Bal')

and a declaration that deposit is undone by undo_deposit.

undo( deposit(Amo, Ac), undo_deposit(Amo, Ac) )

But note that now an account may get a negative balance, which usually is
forbidden. The decision whether this is nevertheless tolerable is a question of 
the bank’s business policy. You could also imagine the policy that the balance 
on an account may become negative due to compensation, but that a human 
operator is notified if the overdraft accounts for more than, say, $5000. 
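
The difference between the two ways of compensating a deposit can be made concrete with a small Python sketch. The functions mirror the rules of the running example only loosely: the dictionary `db`, the `notify` callback, and the `limit` parameter (set to the 5000 mentioned above) are assumptions made for illustration.

```python
def withdraw(db, amo, ac):
    """Fails logically if the balance would become negative."""
    if db[ac] - amo < 0:
        return False
    db[ac] -= amo
    return True


def deposit(db, amo, ac):
    db[ac] += amo
    return True


def undo_deposit(db, amo, ac, notify=None, limit=5000):
    """Compensation that cannot fail logically; it may overdraw the account."""
    db[ac] -= amo
    if notify is not None and db[ac] < -limit:
        notify(ac, db[ac])          # policy: tell a human operator about large overdrafts
    return True


# Scenario: another transaction withdraws between the deposit and its compensation.
db = {88004: 0}
deposit(db, 100, 88004)                          # transaction T1
withdraw(db, 50, 88004)                          # concurrent transaction T2
naive_ok = withdraw(db, 100, 88004)              # compensating T1 by withdraw fails
safe_ok = undo_deposit(db, 100, 88004)           # compensating T1 by undo_deposit succeeds
```

After the scenario, `naive_ok` is `False` because the overdraft check rejects the compensating withdraw, while `undo_deposit` succeeds and leaves the account with a balance of -50.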

Finally, the same considerations as above can be applied to the undo operation
of a transfer. Again, it is either forbidden to withdraw from the target
account of the transfer until the transaction has finished (either successfully or with
its compensation), or it is accepted that the account’s balance may become negative.
In the latter case, a compensation rule

undo_transfer(Amo, Ac1, Ac2) ← undo_deposit(Amo, Ac2) :
                                deposit(Amo, Ac1)

can be given along with the declaration

undo( transfer(Amo, Ac1, Ac2), undo_transfer(Amo, Ac1, Ac2) ).

The compensation rule undo_transfer from Example 3 is a nice representative of
an undo operation that also could have been generated automatically. In general,
for deterministic rules, i.e. rules that do not require non-deterministic choice in
their evaluation (see I.2 and I.4 in Definition 2), compensating operations can
be generated automatically:

Proposition 3 (Generation of Compensation Rules). Let a deterministic
rule $p(X) \leftarrow g_1(X_1) : \ldots : g_n(X_n)$ be given. Then the rule

$p^{-1}(X) \leftarrow g_n^{-1}(X_n) : \ldots : g_1^{-1}(X_1)$

where $g_i^{-1}$, $1 \le i \le n$, is the compensating operation of $g_i$, defines a compensating
operation for $p$.

Proof (sketch): Leaving out the arguments, it obviously is the case that

$p : p^{-1} = g_1 : \ldots : g_{n-1} : g_n : g_n^{-1} : g_{n-1}^{-1} : \ldots : g_1^{-1}$

From the preconditions it follows that $g_n^{-1}$ compensates $g_n$ (for all parameter
instances). Therefore, the pair $g_n : g_n^{-1}$ is a null operation and thus can be removed,
leaving $g_{n-1} : g_{n-1}^{-1}$ next to each other. Now the same argument can be applied
inductively, until $g_1 : g_1^{-1}$. As no non-deterministic choice is involved in the
evaluation of $p$, no additional information is needed in compensation. So, $p^{-1}$
as defined in the proposition is a compensating operation for $p$.

Of course, in an actual execution the operations $g_i$ and $g_i^{-1}$ may be interleaved
with operations from other transactions. Yet, as the scheduler guarantees seria- 
lizability this interleaved execution is equivalent to a serial one, so that we may 
assume a serial execution as shown in the proof. 
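
The construction of Proposition 3 amounts to reversing the rule body and replacing every subgoal by its compensating operation. The following Python sketch illustrates this under simplifying assumptions: rules are represented as plain tuples, the helper `generate_undo_rule` and the table `UNDO_NAMES` are hypothetical, the body of transfer is assumed to be withdraw followed by deposit, and every compensating operation takes the same arguments as its forward operation.

```python
# Assumed undo declarations: forward operation name -> compensating operation name.
UNDO_NAMES = {"withdraw": "deposit", "deposit": "undo_deposit"}


def generate_undo_rule(head, body, undo_names):
    """Build the compensating rule for a deterministic rule head <- body.

    head is (name, args); body is a list of (name, args) subgoals.
    The new body contains the inverse subgoals in reverse order.
    """
    name, args = head
    inverse_body = [(undo_names[g], g_args) for g, g_args in reversed(body)]
    return ("undo_" + name, args), inverse_body


transfer_head = ("transfer", ("Amo", "Ac1", "Ac2"))
transfer_body = [("withdraw", ("Amo", "Ac1")), ("deposit", ("Amo", "Ac2"))]
undo_head, undo_body = generate_undo_rule(transfer_head, transfer_body, UNDO_NAMES)
```

Applied to the transfer rule, the sketch yields a head `undo_transfer(Amo, Ac1, Ac2)` with the body `undo_deposit(Amo, Ac2) : deposit(Amo, Ac1)`, which agrees with the hand-written compensation rule of Example 3.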

Note, however, that this works only for simple cases. In the presence of non- 
determinism, e.g. if an operation is defined by several rules, the undo rules may 
need additional tests to determine the outcome of the non-deterministic choice. 
This may be done by analyzing the effects that have to be compensated or with
special log entries reflecting the selected rule body. But in general rules like this
cannot be generated automatically and therefore must be provided by a human 
programmer. In a transaction abort as described in Definition 4 the system log 
is analyzed backwards (point 2) and so it is known which operations have been 
executed and must consequently be undone. 

Another drawback of the automatic generation of undo rules is that the pro- 
grammer may have additional semantic knowledge which the system cannot use. 
Recall the discussion about the compensation for deposit in Example 3: The 
solution that an overdraft on the account may be acceptable, but that a human 
operator must be notified cannot be deduced automatically. Things like that 
require a human being who is familiar with the system, its environment, busin- 
ess policies, etc. Moreover, there are cases where several forward steps can be 
undone by only one undo operation, but this is not known to a straight-forward 
generator of compensating actions. A classical example for this is a do opera- 
tion that creates a new relation and inserts data into it. A human programmer 
knows that this can be undone by simply dropping the relation, while an au- 
tomatically generated compensating action would delete all the inserted tuples 
before dropping the relation. 

4.2 Compatibility Information 

The second element of additional information contained in the program is infor- 
mation about the compatibility or conflict of complex operations. Testing the 
compatibility of operations is an issue in most transaction scheduling proto- 
cols. All the widely used techniques, like serialization graph testing, locking, or 
time-stamp based methods are founded on a notion of conflict or compatibility 
between operations. This central role of compatibility information is emphasized 
in advanced transaction models like nested transactions [1]. 

As the scheduler in our ULTRA system does not only handle basic operations 
but also complex operations specified by a programmer, it must be provided with 
a kind of compatibility relation between all the operations, including compen- 
sating operations and pure retrieval operations (for EDB/IDB predicates). Note 
that the compatibility and conflict information is used by the scheduler to decide 
about the serializability of an interleaved execution of concurrent transactions. 
So, if two operations are specified to be compatible while in reality they are not, 
this leads to histories that are not serializable and thus incorrect. 

The needed compatibility relation can be given e.g. in the form of a com- 
patibility matrix which contains one row/column for each (basic or complex) 
operation. Yet, providing such a matrix is a strenuous task as it grows quadra- 
tically in the number of operations. Moreover, as the complex operations may 
operate on different levels of abstraction it is necessary to take the operations’ 
arguments into account as well. 

Example 4 (Bank Transfer (Cont.)). A compatibility matrix for our banking
operations withdraw, deposit, and transfer would state that deposit and transfer
are in conflict, as a transfer operation may fail if it is executed before a deposit
and succeed afterwards. Yet, the order between the two operations is only relevant
if the from-account of the transfer is affected by the deposit. If the
transfer’s to-account is also credited by deposit, the order of the two operations
is irrelevant. This can only be expressed when the arguments are considered,
too.

As declarations with arguments cannot be accomplished easily with compati- 
bility matrices we propose to use a simple language to specify the conditions 
under which operations are compatible or in conflict. This language should, on 
the one hand, have enough expressive power to declare the necessary conditions, 
typically (in)equalities of arguments or range checks. On the other hand, com- 
patibility tests must be carried out efficiently, so the declarations must not be 
too complex (i.e. no full-fledged language with recursion etc.) and the conditions 
must be expressed only on the arguments of the operations. In particular, they 
must not refer to the current content of the database, as this would require a 
database access during compatibility check. So we can define: 

Definition 5 (Compatibility Declaration). The general form of a compatibility
declaration for two (complex) operations p(X) and q(Y) is

comp( p(X), q(Y) ) ← condition(X, Y)

where condition(X, Y) is a boolean expression using at most the variables that
appear as arguments of the predicates p and q.

The set D of all declarations of this form is called the declaration set. 



Example 5 (Bank Transfer (Cont.)). Compatibility declarations for the predi- 
cates of our banking example are shown here. Note that the first arguments 
(the amounts of money) are irrelevant for the compatibility behaviour that only 
depends upon the accounts accessed. 



$comp(deposit(A, Ac), deposit(A', Ac')) \Leftarrow true$
$comp(withdraw(A, Ac), withdraw(A', Ac')) \Leftarrow Ac \neq Ac'$
$comp(deposit(A, Ac), withdraw(A', Ac')) \Leftarrow Ac \neq Ac'$
$comp(deposit(A, Ac), transfer(A', Ac'_1, Ac'_2)) \Leftarrow Ac \neq Ac'_1$
$comp(withdraw(A, Ac), transfer(A', Ac'_1, Ac'_2)) \Leftarrow Ac \neq Ac'_1 \wedge Ac \neq Ac'_2$
$comp(transfer(A, Ac_1, Ac_2), transfer(A', Ac'_1, Ac'_2)) \Leftarrow Ac_1 \neq Ac'_1 \wedge Ac_1 \neq Ac'_2 \wedge Ac_2 \neq Ac'_1$



Then, to test a pair of operations for compatibility or conflict, the scheduler 
only has to evaluate the condition given in the declarations about the pair. If 
the condition evaluates to true, the operation pair is compatible. If it is false, 
or there is no appropriate declaration about the pair to be checked, the sche- 
duler must assume a conflict. Note that we allow at most one declaration for 
every pair of operations. As compatibility is a symmetric relation, a declaration 
$comp(p(X), q(Y))$ implies the corresponding declaration for $comp(q(Y), p(X))$.

Definition 6 (Compatibility Test). Two operations $p(c)$ and $q(c')$ are compatible
according to a declaration set $D$, iff there exists a declaration

$comp(p(X), q(Y)) \Leftarrow condition(X, Y)$

or

$comp(q(Y), p(X)) \Leftarrow condition(X, Y)$

in $D$ such that $condition(c, c')$ evaluates to true.
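
The compatibility test can be illustrated with a small Python sketch. It is not the ULTRA implementation: the dictionary `D`, the function `compatible`, and the encoding of arguments as tuples are assumptions made for illustration. The conditions are the ones read from Example 5, where the first argument of each operation is the amount and the remaining arguments are accounts.

```python
from typing import Callable, Dict, Tuple

# A condition inspects only the argument tuples of the two operations.
Condition = Callable[[tuple, tuple], bool]

# Hypothetical encoding of the declaration set D: at most one entry per pair.
# Argument layout: deposit(A, Ac), withdraw(A, Ac), transfer(A, Ac1, Ac2).
D: Dict[Tuple[str, str], Condition] = {
    ("deposit", "deposit"): lambda x, y: True,
    ("withdraw", "withdraw"): lambda x, y: x[1] != y[1],
    ("deposit", "withdraw"): lambda x, y: x[1] != y[1],
    ("deposit", "transfer"): lambda x, y: x[1] != y[1],
    ("withdraw", "transfer"): lambda x, y: x[1] != y[1] and x[1] != y[2],
    ("transfer", "transfer"): lambda x, y: (
        x[1] != y[1] and x[1] != y[2] and x[2] != y[1]
    ),
}


def compatible(p: str, c: tuple, q: str, c2: tuple,
               decls: Dict[Tuple[str, str], Condition] = D) -> bool:
    """Return True iff p(c) and q(c2) are compatible according to decls."""
    if (p, q) in decls:
        return decls[(p, q)](c, c2)
    if (q, p) in decls:
        # Compatibility is symmetric: reuse the declaration for the swapped pair.
        return decls[(q, p)](c2, c)
    # No declaration for this pair: the scheduler must assume a conflict.
    return False
```

A deposit into the to-account of a transfer is compatible with that transfer, whereas a deposit into its from-account is not; an operation without any declaration is treated as conflicting with everything.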

Scheduling protocols use the information about compatibility or conflict to en- 
sure that interleaved executions are correct, i.e. serializable. The protocols ge- 
nerate only interleaved executions where the ordering of conflicting operations 
corresponds to a serial execution - whether this is done using locks, time-stamps, 
or other techniques does not matter here. However, it should be noted that it is 
safe to assume a conflict between operations per default: 

Proposition 4. Considering missing declarations as an indication of conflict 
as done in Definition 6 will not compromise correctness of schedules generated 
by a scheduling protocol. 

Proof (sketch): If operations are considered to be in conflict, the scheduling
protocol only allows interleaved executions that order them according to a serial
execution, i.e. some interleaved executions are considered non-serializable by the
protocol and thus are not generated. Yet, the protocol still ensures that the ge- 
nerated schedules are serializable. In the worst case, if every pair of operations 
is considered to be in conflict, the resulting schedules would even be serial. Con- 
sequently, assuming a conflict if no information is given will not compromise 
correctness. 

Obviously, assuming a conflict may reduce the possible amount of concurrency 
that can be exploited by the protocol. Yet, to ensure correctness this must be 
tolerated. 

However, since undo operations must also be handled by the scheduler, compatibility
declarations are necessary for them as well. As this again increases the
number of potential declarations, adequate tool-support to aid the user would 
be desirable. 

5 Conclusion 

We presented an operational model for performing transactions specified in the 
ULTRA language. The model is based on open nested transactions and features 
exploitation of compatibility information and compensation at levels above the 
basic operations, which may enhance the performance in complex and distributed 
information systems. As far as we can see from literature, this is an innovative 
attempt to bring together logical specification languages and recent results in 
the field of transaction theory. We assume that the clear semantics of logical
languages and the operational aspects of transactions can create a firm basis for
a useful transaction programming environment. 

Our current work includes the extension of the ideas presented above to the 
full ULTRA concept [19] (including concurrent conjunction and bulk updates) 
and their investigation at the theoretical level. In addition, the ULTRA seman- 
tics itself is extended to arbitrary basic operations. Although the scheduler and 
the transaction model already provide the desired flexibility, this requires some 
non-trivial generalizations at the semantical level of ULTRA. In parallel to the 
work at the semantics, we investigate how to specify and reason about the meta 
information which is essential to deal with the extended transaction model. Our 
objective is to provide tools that assist the user in the composition of the needed 
declarations.

Abstract. In this paper we introduce inheritance in deductive object
databases and define an operator for hierarchically composing deductive 
objects with state evolution capabilities. Evolution of such objects mo- 
dels the expected transactional behavior while preserving many impor- 
tant features of deductive databases. Deductive objects can be organized 
in ISA schemas where each object may inherit or redefine the rules defi- 
ned in other objects. The resulting inheritance mechanism handles both 
the deductive and the update/transactional issues. Our framework ac- 
commodates several types of inheritance such as overriding, extension, 
and refinement. Besides presenting the language, this paper defines its 
semantics and provides a description of the interpreter for the language 
that has been implemented. 



1 Introduction 

Deductive and object-oriented databases have been the focus of intense research 
over the last years. The former extend the mathematical foundations of relational 
databases towards declarative rule databases. The latter provide the modularity 
and encapsulation mechanisms lacking in relational databases. It is not surprising
that the area of deductive object-oriented databases has been influenced by, among
others, research in databases, logic programming, artificial intelligence, and
software engineering.

In our work we take the database point of view where (deductive) objects 
have the granularity of logical theories and extensional updates are expressed 
within the rule language to model methods. Cooperation among objects is supported
by message passing, extending the Datalog language, as specified in [5].
The aim of this paper is to extend such an approach to accommodate different
types of inheritance among objects. Thus we define a language, called Obj$^{inh}$-Datalog,
that, in addition to the above notions, expresses simple inheritance,
overriding, extension, and refinement. The resulting language supports two different
cooperation mechanisms among objects: message passing and inheritance.
An Obj$^{inh}$-Datalog database, indeed, consists of a set of objects that, besides
cooperating through message exchanges, may also inherit predicate definitions 
one from another. When an object is defined as a specialization of another ob- 
ject, it must contain all methods of the parent object, but it can change their 
implementations. For each method it can keep the implementation defined in the 
parent object, can totally change it, or can slightly modify it, either by extending 
or by refining it. The language we propose provides those different possibilities 
on a per-rule rather than on a per-predicate basis, thus achieving a broader set 
of modeling options and enhancing flexibility. 

We remark that our proposal focuses on inheritance mechanisms and on their 
use to provide a broad spectrum of modeling possibilities and to maximize code 
reuse. Thus, the language we consider is a very simple deductive object language 
with updates, from which we leave out all the features that are not relevant to 
the main stream of our investigation. In particular, the considered language does 
not support the notion of class, and inheritance relationships are defined among 
objects, which can thus be seen as prototypes [26]. These objects can be very 
useful to design methods and to verify their properties. The proposed approach 
can however be extended to any deductive object language providing the notion 
of class and inheritance relationships among classes. 

The language has a two step semantics. The first step computes the bindings 
and collects the updates that will be performed in an all-or-nothing style in the 
second step. The resulting semantics models the traditional query-answer process 
as well as the transactional behavior. The advantage of this semantics is to allow 
a smooth integration between the declarative rule language and the updates. 
Indeed, no control is introduced within rules even if updates are defined in rules. 
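
The two-step evaluation can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the functions `collect` and `apply_updates`, the toy rule, and the encoding of facts as tuples are hypothetical, and the decision to reject an update set that both inserts and deletes the same fact is an assumption of the sketch, not something stated above.

```python
from typing import Dict, Iterable, List, Set, Tuple

Fact = Tuple[str, tuple]      # e.g. ("q", ("a",))
Update = Tuple[str, Fact]     # ("+", fact) for insertion, ("-", fact) for deletion


def collect(state: Set[Fact]) -> Tuple[List[Dict[str, str]], Set[Update]]:
    """First step for the toy rule  move(X) <- -q(X), +r(X), q(X).

    Computes the bindings for X and collects the updates; the state is not
    modified while the rule is being evaluated.
    """
    bindings: List[Dict[str, str]] = []
    updates: Set[Update] = set()
    for pred, args in state:
        if pred == "q":
            bindings.append({"X": args[0]})
            updates.add(("-", ("q", args)))
            updates.add(("+", ("r", args)))
    return bindings, updates


def apply_updates(state: Set[Fact], updates: Iterable[Update]) -> Set[Fact]:
    """Second step: apply all collected updates, or none of them."""
    updates = set(updates)
    inserts = {fact for op, fact in updates if op == "+"}
    deletes = {fact for op, fact in updates if op == "-"}
    # ASSUMPTION of this sketch: contradictory requests abort the whole step.
    if inserts & deletes:
        return set(state)
    return (state - deletes) | inserts
```

Because the updates are only collected during the first step, the order in which atoms appear in a rule body plays no role in the outcome.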

The paper is structured as follows. The language is presented in Section 
2 and its semantics is given in Section 3. Section 4 shows how the language 
is interpreted in the prototype that has been implemented, whereas Section 5 
compares our approach with related work. Finally, Section 6 concludes the work. 



2 Language 

The language we propose supports two fundamentally different cooperation me- 
chanisms among objects: message passing and inheritance. When an object o 
sends a message m to an object o' it asks o' to solve the goal m, thus the evalua- 
tion context is switched to o' . When, by contrast, an object o inherits a method 
m from an object o' , it simply means that the definition of m in o' is employed, 
but the context of evaluation is maintained to be the initial method receiver, that 
is, o. Thus, inheritance can be seen as message passing without changing self. 
Indeed, messages that cannot be answered using the receiver message protocol 
are forwarded to the parent without changing self; when the forwarded message 
is answered by executing a parent method, every subsequent message sent to self 
will be addressed to the receiver of the initial message. Hence, the context of eva- 
luation is maintained to be the initial message receiver. The following example 
illustrates the difference between the two cooperation mechanisms.

Example 1. Consider an object $obj_1$ whose state only consists of the fact $k(a)$
without methods, and an object $obj_2$ whose state only consists of the fact $k(b)$
and whose only method $m$ is defined by the rule $m(X) \leftarrow k(X)$. Consider first
message passing. $obj_1$ may ask $obj_2$ to evaluate the goal $?m(X)$ (this can be
accomplished by specifying in $obj_1$ a rule $m(X) \leftarrow obj_2 : m(X)$); the result of
evaluating $?m(X)$ in $obj_1$ would be $X = b$ since the evaluation is performed
with respect to the state of $obj_2$. Consider now inheritance: if $obj_1$ inherits method $m$
from $obj_2$ (this can be specified by stating $obj_1 \prec obj_2$, since method $m$ is not
defined in $obj_1$) the result of evaluating $?m(X)$ in $obj_1$ would be $X = a$ since
the evaluation is performed with respect to the state of $obj_1$. ◇
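
The difference between the two mechanisms can be mimicked in a few lines of Python. The class `Obj` and its methods `lookup` and `solve` are hypothetical names chosen for this sketch; rules are modeled as plain functions that receive the evaluation context, which is a strong simplification of the actual refutation procedure.

```python
class Obj:
    """A toy deductive object: facts, methods, and an optional parent."""

    def __init__(self, name, facts, methods=None, parent=None):
        self.name = name
        self.facts = facts              # predicate name -> set of constants
        self.methods = methods or {}    # method name -> function(context) -> answers
        self.parent = parent

    def lookup(self, m):
        """Find the definition of m in this object or in its ancestors."""
        obj = self
        while obj is not None:
            if m in obj.methods:
                return obj.methods[m]
            obj = obj.parent
        raise LookupError(m)

    def solve(self, m):
        # The definition may come from an ancestor, but it is evaluated
        # against self: inheritance does not change the context.
        return self.lookup(m)(self)


def m_rule(context):
    """m(X) <- k(X): k is read from the current evaluation context."""
    return set(context.facts.get("k", set()))


obj2 = Obj("obj2", {"k": {"b"}}, {"m": m_rule})

# Message passing: obj1 defines m(X) <- obj2 : m(X), switching context to obj2.
obj1_msg = Obj("obj1", {"k": {"a"}}, {"m": lambda context: obj2.solve("m")})

# Inheritance: obj1 has no m of its own and inherits it from obj2.
obj1_inh = Obj("obj1", {"k": {"a"}}, parent=obj2)
```

Solving `m` in `obj1_msg` answers with the state of `obj2`, while solving it in `obj1_inh` answers with the state of `obj1`, mirroring the two outcomes of Example 1.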

In the remainder of this section, we first introduce objects and the message 
passing mechanism, and then we discuss how objects can be combined through 
inheritance. 



2.1 Objects and Cooperation through Message Passing 

Each real-world entity is modeled by an object. Each object has an identifier (ob- 
ject identifier, shortly OID) and a state. The state of an object is represented by a 
set of attributes, characterizing the properties of the object. The state of the ob- 
ject is encapsulated, that is, can only be modified by invoking operations that are 
associated with the object. An object communicates with other objects through 
message exchanges. A message may contain a request to retrieve an object at- 
tribute or to modify its state. The use of object identifiers as possible predicate 
arguments allows the state of an object to contain a reference to another object, 
and thus to express aggregation (part-of) relationships among objects, in that 
the value of an object attribute may be the identifier of another object. 

In a conventional logic program, all facts and rules appearing in the program 
can be used in a deduction step. By contrast, in an Obj$^{inh}$-Datalog database,
there exist several sets of rules and facts collected in different objects. Therefore, 
at each step only the facts and rules of a specific object can be used. As a conse- 
quence, a goal must be addressed to a specific object, and the refutation is exe- 
cuted by using only facts and rules belonging to that object, until a rule is found 
containing a labeled atom in its body. When such a labeled atom is found, the 
refutation process “moves” to the object specified by the OID labeling this atom. 

An object is modeled as a set of facts and rules, where the facts represent 
the attribute values of an object and the rules represent the methods. Methods 
are used to compute derived attributes or to perform operations that modify the 
object state. Rules may contain both action atoms and deduction atoms in their 
bodies. Action atoms represent the basic mechanism supporting object state 
evolutions. Moreover, rule bodies may contain (deductive) atoms labeled with 
OIDs. The meaning of a labeled atom is to require the refutation of the atom by 
using the facts and rules of the object, whose OID labels the atom. Therefore, 
labeled atoms are the basic mechanism supporting message exchanges among 
objects. The object to which the message is sent can be fixed at program definition
time (in which case the label is a (constant) OID) or can vary depending on
the value of some object properties (in which case the label is an object-denoting
variable). 

The notion of object is formalized by the following definitions. We consider
a many-sorted signature $\Sigma = \{\Sigma_o, \Sigma_v\}$, only containing constant symbols. $\Sigma_o$
is the set of object identifiers, that is, the values used to denote objects, while
$\Sigma_v$ is the set of constant value symbols. The sets $\Sigma_o$ and $\Sigma_v$ are disjoint. We
moreover consider a set of predicate symbols $\Pi$ partitioned, as in Datalog, into
extensional predicate symbols $\Pi^e$ and intensional predicate symbols $\Pi^i$. Both
$\Pi^e$ and $\Pi^i$ are families of predicate symbols $\Pi^{e/i} = \{\Pi^{e/i}_w\}_{w \in S^*}$, where $S^*$
denotes the set of all possible strings of sorts, $S = \{o, v\}$ (object identifiers and
values). We denote with $\Pi_w$ the set of predicate symbols $\Pi^e_w \cup \Pi^i_w$. A family of
disjoint sets of variable symbols for each sort, $V = \{V_o, V_v\}$, is considered. Terms
in $Term = \{Term_o, Term_v\}$ are defined as usual for each sort of our language:
a term is either a constant or a variable.

Definition 1. (Deduction Atom). A deduction atom is defined as the application
of a predicate symbol to terms of the appropriate sorts, that is, if $p \in \Pi_w$,
$n = length(w)$ and $\forall i, i = 1 \ldots n$, $t_i \in Term_o$ if $w.i = o$ while $t_i \in Term_v$ if
$w.i = v$, then $p(t_1, \ldots, t_n)$ is a deduction atom, also denoted as $p(\tilde{t})$. □

Deduction atoms are partitioned into extensional deduction atoms, those built on
predicates in $\Pi^e$, and intensional deduction atoms, those built on predicates in
$\Pi^i$.

Update operations are expressed in our language (as in U-Datalog [24] and
in LDL [25]) by action atoms in rule bodies.

Definition 2. (Action Atom). An action atom is an extensional deduction atom
prefixed by $+$ (denoting insertion) or $-$ (denoting deletion), that is, if $p(t_1, \ldots, t_n)$
is an extensional deduction atom, then $+p(t_1, \ldots, t_n)$ and $-p(t_1, \ldots, t_n)$ are
action atoms. □

Cooperation among objects in the database is supported in our language by 
labeled atoms. A labeled atom represents a request of evaluating the deduction 
atom in the object denoted by the label. Two different kinds of labeled atoms are 
provided. C-labeled atoms model a fixed cooperation among objects, while v-labeled
ones model a cooperation depending on the value to which the variable in
the label is bound. Thus, let $p \in \Pi_v$, $a \in \Sigma_v$, $obj_i \in \Sigma_o$ and $O \in V_o$; then $obj_i : p(a)$
is a c-labeled atom (which represents the atom $p(a)$ in the context of the
fixed object $obj_i$), whereas $O : p(a)$ is a v-labeled atom. Given a substitution
$\vartheta$ assigning a value to $O$, $O : p(a)$ represents the atom $p(a)$ in the context of
the object $O\vartheta$.

Definition 3. (C-labeled Atom). Let $obj_h \in \Sigma_o$ be an object identifier and
$p(t_1, \ldots, t_n)$ be a deduction atom, then $obj_h : p(t_1, \ldots, t_n)$ is a c-labeled atom. □

Definition 4. (V-labeled Atom). Let $X \in V_o$ be a variable denoting an object
identifier and $p(t_1, \ldots, t_n)$ be a deduction atom, then $X : p(t_1, \ldots, t_n)$ is a
v-labeled atom. □

Having introduced all kinds of atoms that can be used in our language, we are
now able to introduce the notion of rule. 

Definition 5. (Rule). A rule has the form

$H \leftarrow U, B, B^c, B^v$

where:

— $H$ is an intensional deduction atom;

— $U = U_1, \ldots, U_i$ is a vector of action atoms, constituting the update part of
the rule;

— $B = B_1, \ldots, B_w$ is a vector of deduction atoms, constituting the unlabeled
part of the condition, that is, of atoms referring to the object where the rule is
defined;

— $B^c = obj_1 : B'_1, \ldots, obj_z : B'_z$ is a vector of c-labeled atoms, that is, of atoms
referring to specific objects;

— $B^v = X_1 : B''_1, \ldots, X_r : B''_r$ is a vector of v-labeled atoms, that is, atoms not
referring to specific objects;

— $X_1, \ldots, X_r$ must appear as arguments of a deduction atom in $B_1, \ldots, B_w$.

The update part ($U$) and the condition part ($B$, $B^c$, $B^v$) cannot both be empty.
$H$ is referred to as the head of the rule, while $U$, $B$, $B^c$, $B^v$ constitute the body of the
rule. For a rule to be safe [9], all the variables in $H$ and all the variables in $U$ (footnote 1)
must appear in the condition part of the rule ($B$, $B^c$, $B^v$). □

We remark that the symbol "," in the rule bodies denotes logical conjunction,
thus the order of atoms is irrelevant. 

Example 2. The following is an example of an Obj$^{inh}$-Datalog rule.

$k(X, Y) \leftarrow -t(Z), +t(N), t(Z), p(X), obj_i : r(Y, N), X : k(Y)$ ◇
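
The two syntactic conditions on rules (safety, and the requirement that label variables occur in an unlabeled deduction atom) can be checked mechanically. The following Python sketch is an illustration only: the classes `Atom` and `Rule`, the function names, and the convention that variables start with an upper-case letter are assumptions of the sketch. The rule encoded at the end is the one of Example 2; the OID label is written `obj_i` and its index plays no role here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


def is_var(term: str) -> bool:
    """Assumed convention: variables start with an upper-case letter."""
    return term[:1].isupper()


@dataclass(frozen=True)
class Atom:
    pred: str
    args: Tuple[str, ...]
    label: Optional[str] = None  # None, an OID constant, or an object variable

    def variables(self) -> Set[str]:
        return {t for t in self.args if is_var(t)}


@dataclass
class Rule:
    head: Atom
    updates: List[Atom] = field(default_factory=list)    # U (signs omitted)
    condition: List[Atom] = field(default_factory=list)  # B, B^c, B^v together


def is_safe(rule: Rule) -> bool:
    """All variables of the head and of the updates occur in the condition."""
    cond_vars: Set[str] = set()
    for atom in rule.condition:
        cond_vars |= atom.variables()
    needed = set(rule.head.variables())
    for atom in rule.updates:
        needed |= atom.variables()
    return needed <= cond_vars


def label_variables_bound(rule: Rule) -> bool:
    """Every label variable occurs as an argument of an unlabeled atom."""
    unlabeled_vars: Set[str] = set()
    for atom in rule.condition:
        if atom.label is None:
            unlabeled_vars |= atom.variables()
    return all(
        atom.label in unlabeled_vars
        for atom in rule.condition
        if atom.label is not None and is_var(atom.label)
    )


# k(X,Y) <- -t(Z), +t(N), t(Z), p(X), obj_i : r(Y,N), X : k(Y)
example2 = Rule(
    head=Atom("k", ("X", "Y")),
    updates=[Atom("t", ("Z",)), Atom("t", ("N",))],
    condition=[
        Atom("t", ("Z",)),
        Atom("p", ("X",)),
        Atom("r", ("Y", "N"), label="obj_i"),
        Atom("k", ("Y",), label="X"),
    ],
)
```

Both checks succeed for the rule of Example 2, whereas a rule such as $k(X) \leftarrow +t(N), p(X)$ is rejected as unsafe because $N$ does not occur in the condition part.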

An object $obj_j$, where $obj_j \in \Sigma_o$ is the object identifier, consists of an object
state and a set of methods. The object state $EDB_j$ is a set of facts, that is, a set
of ground extensional deduction atoms. The object state is a time-varying component;
thus, in the following we denote with $EDB_j^t$ the possible states of object
$obj_j$, i.e. $EDB_j^t$ denotes the $t$-th state of object $obj_j$. Methods are expressed by
rules.

Definition 6. (Object). An object $obj_j = (EDB_j, IDB_j)$ consists of an identifier
$obj_j$ in $\Sigma_o$, of an extensional component $EDB_j$, which is a set of ground
extensional deduction atoms, called object state, and an intensional component
$IDB_j$, which is a set of rules as in Definition 5, expressing methods. □

Referring to Definition 5, we notice that action atoms cannot be labeled. Indeed, 
to ensure encapsulation, the updates can only refer to the object itself. Note that, 
as is quite usual in the database field, we do not encapsulate object attributes with
respect to queries. That is, the value of an object attribute can be queried from
outside the object. Otherwise, under strict encapsulation, a number of trivial
methods that only return attribute values would have to be written for use in queries.



1 This ensures that only ground updates are applied to the database.

2.2 Inheritance

In this section we describe the capabilities of our language for structuring information
through specialization. An Obj$^{inh}$-Datalog database consists of a set
of objects that, besides cooperating through labeled atoms, may also inherit
predicate definitions from each other. Whenever an object $obj_j$ specializes another
object $obj_i$, the features of $obj_i$ are inherited by $obj_j$; $obj_j$ may in turn
add more features, or redefine some of the inherited features. The redefinition
of an inherited feature means that $obj_j$ contains a feature with the same name
as, but a different definition from, a feature in $obj_i$. The redefinition is a form of conflict
between the two objects. In Obj$^{inh}$-Datalog, according to the object-oriented
paradigm, we solve this type of conflict by giving precedence to the most specific
information; therefore, the definition of a feature given in an object always
takes precedence over a definition of the same feature given in any of the
objects the given object inherits from. This type of approach is called overriding.
The specialization relationship among objects impacts not only the object
structures, but also the behavior specified by the objects. Given objects $obj_i$
and $obj_j$, such that $obj_j$ inherits from $obj_i$, $obj_j$ must contain all the methods of
$obj_i$, but it can change their implementations. For each method, $obj_j$ can keep
the implementation of $obj_i$, can totally change it, or can slightly modify it. Consider
a predicate $p$ defined by one or more rules in $obj_i$; the following modeling cases
may arise:

1. simple inheritance

$obj_j$ does not define predicate $p$; therefore, $obj_j$ inherits $p$ from $obj_i$;

2. overriding

$obj_j$ redefines predicate $p$, thus overriding the definition of $p$ provided by
$obj_i$;

3. extension

$obj_j$ extends the definition of $p$ provided by $obj_i$, so that $p$ in $obj_j$ is defined
by a set of clauses which is the union of the clauses for $p$ in $obj_i$ and
the clauses for $p$ in $obj_j$;

4. refinement

$obj_j$ refines the definition of $p$ provided by $obj_i$; $p$ is therefore
defined in $obj_j$ by a clause whose body is the conjunction of the bodies of
clauses for $p$ in $obj_i$ and in $obj_j$, with the heads properly unified.

Note that object $obj_j$ may provide predicates in addition to those of $obj_i$
by defining predicates which are not defined in $obj_i$.

The above modeling possibilities offer designers a broad spectrum of reuse modalities.
In addition to single inheritance and overriding, which are
usual in the object-oriented context, we support extension and refinement,
which offer novel and useful opportunities to refine object behavior. Both
correspond to the idea of behavioral subtyping [18], which can be achieved in
object-oriented programming languages by exploiting super calls or the inner
mechanism. In particular, extension allows additional cases specific
to the inheriting object to be handled through the addition of clauses, whereas refinement
allows the behavior to be specialized by adding some conditions or actions. Note
that, in some way, extension can be thought of as a sort of contravariant behavior
subtyping whereas refinement can be thought of as a sort of covariant behavior
subtyping (footnote 2).

The following example motivates the usefulness of the modeling possibilities 
above. 

Example 3. Consider two objects $obj_{person}$ and $obj_{student}$, such that $obj_{student}$
inherits from $obj_{person}$. The following are examples corresponding to each of the
modeling cases above.

1. Simple inheritance: The rule defining a method to evaluate the age of a
person, given his birth date, is the same for $obj_{person}$ and $obj_{student}$.

2. Overriding: The predicate young of $obj_{person}$, returning True if the person
is considered young, most likely will be redefined in $obj_{student}$. Indeed, the
criteria for determining when a person is young are probably different from
the criteria used for a student.

3. Extension: Consider a predicate intelligent of $obj_{person}$, returning True if
the IQ of the person is greater than a given limit. Suppose that $obj_{student}$
contains the score that a student receives on a given test. Moreover, suppose
that a student is considered intelligent if either: (i) his IQ is greater than
the given limit (the same as for $obj_{person}$); or (ii) his score in the test is greater
than a given limit. The predicate intelligent in $obj_{student}$ is then
defined by two different rules.

4. Refinement: Consider a predicate null that assigns a null value to all the facts
in an object. The predicate null in $obj_{student}$ will likely refine the predicate
null defined in $obj_{person}$, since the former should contain update atoms for
all facts added in $obj_{student}$. ◇

We remark that our goal is to support the above modeling possibilities on a 
per-rule rather than on a per-predicate basis, thus achieving a broader set of 
modeling options. Indeed, an object may retain a clause of a predicate definition 
from an object it inherits from, while hiding or refining other clauses of that predicate
definition. To support those modeling possibilities, a mechanism is needed
to refer to specific rules in an object. Labeled rules are then introduced. A labeled
rule has the form

$l_x : head_a \leftarrow BODY_a$

where $l_x \in \mathcal{L}$, with $\mathcal{L}$ a denumerable set of labels. All labels in a given object must
be distinct. $\mathcal{L}(obj_i)$ denotes the set of labels in object $obj_i$.

2 Note that we do not address the issue of covariant-contravariant method (signature)
refinement in the paper, since we do not consider typed variables nor signature
definitions for our methods.

The meaning of labeled rules can be explained as follows. Consider objects
$obj_i$ and $obj_j$, such that $obj_j$ inherits from $obj_i$, and suppose that $l_x \in \mathcal{L}(obj_i)$,
$l_y \in \mathcal{L}(obj_j)$; then given the labeled rules

$l_x : head_a \leftarrow BODY_a$   rule of $obj_i$
$l_y : head_b \leftarrow BODY_b$   rule of $obj_j$

consider the following cases: 

$l_x = l_y$

Then, rule $head_b \leftarrow BODY_b$ of $obj_j$ overrides rule $head_a \leftarrow BODY_a$ of $obj_i$;
the latter is hidden in $obj_j$. It is not possible to hide a predicate without
redefining it. Therefore, if $l_x = l_y$, then $head_a = head_b$, that is, the heads of
the two clauses must be the same. Thus, a rule defining a predicate $p$ in an
object $obj$ may only hide a rule defining the same predicate $p$ in the object
from which $obj$ inherits.

$l_x \neq l_y$

Then, $obj_j$ inherits rule $head_a \leftarrow BODY_a$ from $obj_i$. Therefore, both the
rule $head_a \leftarrow BODY_a$ and the rule $head_b \leftarrow BODY_b$ can be exploited to
evaluate a goal in $obj_j$.

Therefore, by labeling a rule with the label of a rule of the parent object, overriding
(modeling case 2 above) is realized, whereas by using a different label,
extension (modeling case 3 above) is realized.

Example 4. Given the following rules in object $obj_i$

$l_1 : p(X) \leftarrow q(X)$
$l_2 : k(X) \leftarrow r(X)$

if object $obj_j$ inherits from $obj_i$ and its IDB contains the rules

$l_1 : p(X) \leftarrow r(X)$
$l_3 : k(X) \leftarrow q(X)$

then the rule for predicate $p$ is overridden, whereas the definition of predicate $k$
is extended. The resulting set of rules available in $obj_j$ is

$l_1 : p(X) \leftarrow r(X)$
$l_2 : k(X) \leftarrow r(X)$
$l_3 : k(X) \leftarrow q(X)$ ◇
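
The way labels drive overriding and extension can be sketched in Python as follows. The function `available_rules` and the representation of a rule as a pair of strings are assumptions made for this illustration; refinement through inh-atoms is deliberately left out. The data reproduce the situation of the example above, with labels written `l1`, `l2`, `l3`.

```python
from typing import Dict, Tuple

Rule = Tuple[str, str]  # (head, body), kept as plain strings for illustration


def available_rules(parent: Dict[str, Rule], child: Dict[str, Rule]) -> Dict[str, Rule]:
    """Rules usable in the inheriting object, indexed by label.

    A child rule with the label of a parent rule overrides it (the heads must
    coincide); a child rule with a fresh label is added to the inherited ones.
    """
    for label, (head, _) in child.items():
        if label in parent and parent[label][0] != head:
            raise ValueError(f"rule {label} may only hide a rule with the same head")
    merged = dict(parent)   # start from the inherited rules
    merged.update(child)    # same label: override; new label: extension
    return merged


parent_rules = {"l1": ("p(X)", "q(X)"), "l2": ("k(X)", "r(X)")}
child_rules = {"l1": ("p(X)", "r(X)"), "l3": ("k(X)", "q(X)")}
result = available_rules(parent_rules, child_rules)
```

In `result`, label `l1` carries the child's definition of $p$, while $k$ is defined by both the inherited rule `l2` and the new rule `l3`.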

Moreover, to express refinement (modeling case 4 above), a syntactic mechanism
is needed to specify that a rule is a refinement of a rule in the parent
object. Rule bodies are thus extended to contain a special kind of atom (which we
call inh-atom) of the form $l_x : super$. Referring to the objects and rules above, if
$BODY_b$ contains the inh-atom $l_x : super$, then $obj_j$ contains a single
rule of the form

$(p(\tilde{u}))\vartheta \leftarrow (BODY_a, BODY_b)\vartheta$

where $\vartheta = mgu(\tilde{t}, \tilde{u})$, $head_a = p(\tilde{t})$, and $head_b = p(\tilde{u})$ (footnote 3). If $head_a$ and
$head_b$ cannot be unified, then $p$ is defined in $obj_j$ only by rule $head_b \leftarrow BODY_b$.

3 We impose no condition on rule heads for refinement. That is, we do not require
that headb is at least as instantiated as heada. Note indeed that head refinement is 
not particularly meaningful in a context like ours where function symbols are not 
supported. 




Inheritance in a Deductive Object Database Language with Updates 






Example 5. Referring to the rules of object $obj_i$ of Example 4 above, a refinement
of the rule labeled by $l_2$ can be accomplished by an object $obj_k$, inheriting from
$obj_i$, by the following rule:

$l_4 : k(X) \leftarrow q(X), l_2 : super$

In such a way, the resulting rule in $obj_k$ will be

$l_4 : k(X) \leftarrow q(X), r(X)$

which is a refinement of the original rule. ◇
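
The expansion of an inh-atom can be sketched in Python for the function-free case. This is a simplified illustration, not the interpreter of the paper: the names `mgu`, `apply_subst` and `refine`, the encoding of the inh-atom as `("super", (label,))`, and the upper-case convention for variables are assumptions. The sketch also assumes that the two rules share no body-only variable names (no renaming apart is performed) and that the inh-atom is simply dropped when the heads do not unify.

```python
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)
Rule = Tuple[Atom, List[Atom]]      # (head, body)


def is_var(term: str) -> bool:
    """Assumed convention: variables start with an upper-case letter."""
    return term[:1].isupper()


def walk(term: str, subst: Dict[str, str]) -> str:
    """Follow variable bindings until an unbound term is reached."""
    while term in subst:
        term = subst[term]
    return term


def mgu(ts: Tuple[str, ...], us: Tuple[str, ...]) -> Optional[Dict[str, str]]:
    """Most general unifier of two flat argument vectors, or None."""
    if len(ts) != len(us):
        return None
    subst: Dict[str, str] = {}
    for t, u in zip(ts, us):
        t, u = walk(t, subst), walk(u, subst)
        if t == u:
            continue
        if is_var(t):
            subst[t] = u
        elif is_var(u):
            subst[u] = t
        else:
            return None  # two distinct constants
    return subst


def apply_subst(atom: Atom, subst: Dict[str, str]) -> Atom:
    pred, args = atom
    return (pred, tuple(walk(a, subst) for a in args))


def refine(parent: Rule, child: Rule, label: str) -> Rule:
    """Replace the inh-atom `label : super` in child by the parent's body."""
    (p_head, p_body), (c_head, c_body) = parent, child
    inh_atom = ("super", (label,))
    own_body = [a for a in c_body if a != inh_atom]
    theta = mgu(p_head[1], c_head[1]) if p_head[0] == c_head[0] else None
    if theta is None:
        return (c_head, own_body)  # heads do not unify: keep the child's rule only
    body = [apply_subst(a, theta) for a in own_body + p_body]
    return (apply_subst(c_head, theta), body)
```

Applied to the rules of the example above, with parent rule $k(X) \leftarrow r(X)$ and child rule $k(X) \leftarrow q(X), l_2 : super$, `refine` yields a rule with head $k(X)$ and body $q(X), r(X)$.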

Labeled rules thus allow one to represent all the modeling cases previously illustrated.
Consider objects $obj_i$ and $obj_j$, such that $obj_j$ inherits from $obj_i$, and a
predicate $p$ defined in $obj_i$, and let us show how the different options can be
realized.

1. Simple inheritance: It is sufficient that $obj_j$ does not contain any clause
defining $p$.

2. Overriding: For each rule defining $p$ in $obj_i$, having label $l_x$, there must exist
a rule defining $p$ in $obj_j$ whose label is equal to $l_x$.

3. Extension: All rules defining $p$ in $obj_j$ must have labels different from all the
labels associated with the rules defining $p$ in $obj_i$.

4. Refinement: In $obj_j$ all rules defining $p$ must contain the inh-atom $l_x : super$,
where $l_x$ is the label associated with the rule defining $p$ in $obj_i$ whose
body must be put in conjunction with the bodies of the rules in $obj_j$.

Note that through the mechanism above we can refine only some of the rules
defining a predicate $p$, or all the rules defining it, depending on the intended
behavior we want to associate with the inheriting object. Note, moreover, that
when a rule labeled by $l_x$ is refined in an inheriting object through an inh-atom
$l_x : super$ in a rule labeled by $l_y$, it is also inherited by the object. To
prevent this inheritance, the object must contain another rule labeled by $l_x$ (for
instance, it can simply be $l_y = l_x$).

Finally, we remark that semantically meaningful labels can be exploited to 
make evident which kind of behavior is being specified for a given rule. For 
instance, labels of rules introducing new predicates may have a new prefix, labels 
of rules extending an inherited predicate may have an ext prefix, and so on. 
The following definitions formalize the notion of labeled rule. 

Definition 7. (Inh-Atom). An inh-atom has the form $l_x : super$ with $l_x \in \mathcal{L}$. □

We remark that super is a special 0-ary predicate symbol, which cannot appear 
in any other kind of atoms of our language. 

Definition 8. (Labeled Rule). A labeled rule is a rule as in Definition 5, labeled
by a label $l \in \mathcal{L}$; that is, a labeled rule has the form $l : r$ where $r$ is defined as
in Definition 5 extended to contain in its body, in addition to deduction, action,
and labeled atoms, inh-atoms as in Definition 7. □

By using labeled rules we are able to specify what is inherited, what is hidden,
and what is refined at clause granularity, instead of predicate
granularity. Thus, a partial overriding of a predicate is allowed.

An Obj$^{inh}$-Datalog database consists of a set of objects related by an inheritance
hierarchy. The intensional component of each object is a set of labeled
rules. Extensional facts, by contrast, are not labeled, so that only simple inheri- 
tance and overriding are supported on facts. 

Definition 9. (Database). An Obj$^{inh}$-Datalog database is a pair

$O\text{-}DB = (\{obj_1, obj_2, \ldots, obj_s\}, \prec)$

where:

— $\{obj_1, obj_2, \ldots, obj_s\}$ is a set of objects according to Definition 6 such that
the intensional component $IDB_j$ of each $obj_j$, $1 \le j \le s$, is a set of labeled
rules, as in Definition 8, whose labels are all distinct;

— $\prec \subseteq \Sigma_o \times \Sigma_o$ is a relation on objects representing the inheritance hierarchy.
Since we consider only single inheritance, the inheritance relationship $\prec$ is
a tree, that is, if objects $obj_i, obj_j, obj_k$ ($1 \le i, j, k \le s$) exist such that
$obj_j \prec obj_i$ and $obj_j \prec obj_k$, then either $obj_i \prec obj_k$ or $obj_k \prec obj_i$. □

Given $obj_i, obj_j \in \Sigma_o$, $obj_j \prec obj_i$ denotes that object $obj_j$ inherits from object
$obj_i$. Moreover, $\preceq$ denotes the partial order obtained from the non-reflexive
relation $\prec$, that is, $obj_j \preceq obj_i$ denotes the relation $obj_j \prec obj_i \vee obj_i = obj_j$.
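
Under single inheritance the hierarchy can be stored as a map from each object to its direct parent, from which both $\prec$ and $\preceq$ are easily derived. The Python sketch below is an illustration under that assumption; the function names and the object `obj4` are hypothetical.

```python
from typing import Dict, List, Optional


def ancestors(obj: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Objects that obj inherits from, most specific first."""
    chain: List[str] = []
    seen = {obj}
    while parent.get(obj) is not None:
        obj = parent[obj]
        if obj in seen:
            raise ValueError("the inheritance relation must not contain cycles")
        seen.add(obj)
        chain.append(obj)
    return chain


def inherits(child: str, ancestor: str, parent: Dict[str, Optional[str]]) -> bool:
    """The strict relation: child inherits (directly or not) from ancestor."""
    return ancestor in ancestors(child, parent)


def inherits_or_equal(a: str, b: str, parent: Dict[str, Optional[str]]) -> bool:
    """The reflexive relation obtained from the strict one."""
    return a == b or inherits(a, b, parent)


# obj3 inherits from obj1; obj4 is a hypothetical object inheriting from obj3.
hierarchy = {"obj1": None, "obj2": None, "obj3": "obj1", "obj4": "obj3"}
```

Since every object has at most one direct parent, any two ancestors of an object are themselves related, which is the tree condition of Definition 9.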

Note that we restrict ourselves to single inheritance, to avoid name conflicts
that would introduce unnecessary complications in the definition of the language,
without bringing in any relevant issue with respect to the main focus of the paper. 
The approach can however be extended to multiple inheritance, by adopting one 
of the existing approaches to handle name conflicts, such as superclass ordering 
or explicit qualification. 

Example 6. $(\{obj_1, obj_2, obj_3\}, \prec)$ is an example of Obj$^{inh}$-Datalog database, 
with $obj_3 \prec obj_1$ and 

\[
\begin{array}{ll}
EDB_1 = & q(a) \quad r(b) \quad s(obj_2) \\
IDB_1 = & l_1 : p(X) \leftarrow -q(X),\ q(X) \\
        & l_2 : k(X,Y) \leftarrow s(X),\ r(Y),\ X : h(Y) \\
        & l_3 : t(X) \leftarrow r(X) \\
        & l_4 : mr(X) \leftarrow -r(X),\ r(X) \\
EDB_2 = & h(b) \\
EDB_3 = & f(b) \quad q(b) \\
IDB_3 = & l_1 : p(X) \leftarrow -q(X),\ f(X) \\
        & l_5 : t(X) \leftarrow f(X) \\
        & l_6 : mf(X) \leftarrow -f(X),\ f(X) \\
        & l_2 : k(X,Y) \leftarrow l_2 : super,\ f(Y)
\end{array}
\]

Referring to the inheritance relationship between $obj_1$ and $obj_3$, we point out 
that predicate $p$ is overridden, predicate $k$ is refined, predicate $t$ is extended, 
predicate $mr$ is simply inherited, while predicate $mf$ is an additional one. For 
what concerns the extensional component, $obj_3$ simply inherits facts $r(b)$ and 
$s(obj_2)$ from $obj_1$, whereas it overrides fact $q(a)$ by providing a local definition 
for predicate $q$ (that is, fact $q(b)$). ◇ 

Note that our inheritance mechanisms based on rule labeling offer a number of 
alternatives with respect to redefinition of predicates (e.g. partial overriding, rule 
addition, rule refinement). Such alternatives could not be supported at the rule 
level if labeled rules were not provided. Our mechanism obviously requires that 
rule labels are visible in subclasses, thus requiring that predicate definitions not 
be encapsulated with respect to subclasses. However, note that some solutions 
can be devised to the problem of encapsulation. Our language could easily be ex- 
tended to support both labeled and un-labeled rules, with a traditional overriding 
model on a per-predicate basis for some predicates. In such a way, when defining 
a class, the user can decide whether to encapsulate a predicate definition with re- 
spect to subclasses (by not labeling the rules defining it) or to let its definition be 
visible to subclasses, in which different choices can be adopted for inherited rules. 

We also point out that overriding of extensional facts is supported in our 
model. However, since extensional facts are not labeled, on the extensional side 
overriding works on a per-predicate basis. This means that it is not possible to 
inherit a fact on a predicate and to override another fact on the same predicate. 
We took this decision since requiring fact labeling seems an unnecessary burden 
for the user, given that the sophisticated mechanisms provided for rules (useful 
for code reuse) do not seem very useful at the data level. 

We finally remark that we do not consider here the issue of dynamically 
creating and deleting objects from the database, which thus consists of a fixed set 
of cooperating objects. This possibility can obviously be added to the language, 
but the deletion of objects from which other objects inherit must be handled 
carefully (as in all prototype-based languages). 

2.3 Queries 

An important requirement of our language is to model both queries typical 
of deductive databases and queries typical of object-oriented databases. 
In deductive databases, a query (goal) usually has the form $?\,p_1(t_1), \ldots, p_n(t_n)$ 
($n \ge 1$) where each $p_i(t_i)$ ($1 \le i \le n$) is a deductive atom. The meaning of 
such a query is to find all substitutions for the variables in the query so that 
the conjunction of predicates $p_1(t_1), \ldots, p_n(t_n)$ has the truth value True. On 
the other hand, in object-oriented databases, queries are usually addressed to a 
specific object, in the form of messages. To support both querying modalities, 
two different types of queries are defined: 

1. Conjunction of deduction atoms: The meaning of this type of query is to find 
all solutions satisfying the query, independently of the objects where the 
deduction atoms appearing in the query are defined. 

2. Conjunction of object-labeled deduction atoms: The meaning of this type of 
query is to find all solutions satisfying the query starting from the objects 
whose OIDs appear in the query. Note, however, that an object may need 
to send messages to other objects in order to answer the query. 

Note that, whenever a query of the first type is issued, the objects on which 
the query applies may not be related by inheritance relationships. We refer to 
Obj$^{inh}$-Datalog queries as transactions to emphasize that, since update methods 
can be invoked, they do not only return sets of bindings but they can also modify 
the database state. However, updates are not defined in the query but only in 
the object methods (expressed by rules). 

Definition 10. (Transaction). A transaction has the form 

\[ ?\ B, B^c \]

where 

- $B = B_1, \ldots, B_w$ is a vector of deduction atoms, that is, they refer to any 
object of the object database, 

- $B^c = obj_1 : B'_1, \ldots, obj_z : B'_z$ is a vector of c-labeled atoms, that is, they 
refer to specific objects, 

and $B$ and $B^c$ cannot be both empty. □ 

Note that no updates are explicitly stated in the transaction because each object 
uses its own methods (rules) to manipulate the object state. 

Example 7. Examples of transactions are $T_1 = obj_3 : k(X,Y), obj_1 : t(Y)$ and 
$T_2 = p(X)$. ◇ 

3 Semantics 

The semantics of the Obj$^{inh}$-Datalog language is given in two steps. The first step is 
called the marking phase and the second one the update phase. The first step is similar 
to query answering, since it computes the bindings for the query and collects 
the updates. Updates are not executed in this phase. They are executed all together 
in the second phase, provided that there are no complementary updates, modeling the 
expected transactional behavior. 

3.1 Marking Phase Semantics 

In this section we model the behavior of a transaction execution. We forma- 
lize the rules for evaluating a call taking into account the different options for 
behavior inheritance we support. 

A transaction may contain two kinds of atoms: labeled and unlabeled ones. 
The labeled atoms must be refuted in the object whose identifier labels the atom, 
while for unlabeled atoms a refutation is searched for in any object in the object 
database. 

The behavior of a predicate call in an object depends on the labels of the 
rules defining the predicate in that object as well as on the inheritance hierarchy. 
In case of overriding, the notion of most specific behavior is applied, that is, each 
object inherits a predicate from the closest ancestor in the hierarchy that contains 
a definition for that predicate. By contrast, in the case of extension, an object 
may inherit the union of the definition of a predicate in all its ancestors in the 
hierarchy. In case of refinement, finally, the rule obtained by combining the rule 
in the most specific object with the referred rules in its ancestors is exploited. 

The combination of those mechanisms results in the following rule for 
evaluating a call. Consider a transaction $T = p(t)$ to be evaluated in an object 
$obj_i$ belonging to a database $(\{obj_1, \ldots, obj_s\}, \prec)$. The evaluation of $T$ proceeds 
according to the criteria outlined below: 

1. if $p \in \Pi^e$ is an extensional atom: 

a) if p is locally defined in obji, then the definition in obji is used; 

b) the definition in the closest ancestor of obji that defines p is used, other- 
wise; 

2. if $p \in \Pi^i$ is an intensional atom: 

a) if p is locally defined in obji, then the definition in obji is used; moreover, 
if p is also defined in an ancestor objj of obji, and the labels of the rules 
for p in obji and in objj are different, then the p definition of objj is used 
as well; 

b) if p is not locally defined in obji, then the definition in the closest ancestor 
of obji that defines p is used; 

c) if p is locally defined in obji (or defined in an ancestor objj of obji, 
and not overridden), and its definition is a refinement (that is, is a rule 
containing inh-atoms), then the refined definition for p is used. 

Labeled atoms are evaluated by simply changing the evaluation context to the 
object denoted by the label. Action atoms are evaluated by simply adding the 
action to the appropriate update set. 
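The criteria for intensional atoms can be sketched as a lookup that walks up the hierarchy. This is only an illustration under stated assumptions: rules are encoded as `(label, head predicate, body)` triples, an inh-atom is written as the string `"l:super"`, rule heads are assumed to use the same variable names so that no unification is needed, and the label indices of the sample data follow one reading of Example 6. None of the names below (`IDB`, `applicable_rules`, `resolve`) comes from the paper.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Rule = Tuple[str, str, List[str]]  # (label, head predicate, body atoms as strings)

PARENT: Dict[str, Optional[str]] = {"obj1": None, "obj2": None, "obj3": "obj1"}
IDB: Dict[str, List[Rule]] = {
    "obj1": [("l1", "p", ["-q(X)", "q(X)"]),
             ("l2", "k", ["s(X)", "r(Y)", "X:h(Y)"]),
             ("l3", "t", ["r(X)"]),
             ("l4", "mr", ["-r(X)", "r(X)"])],
    "obj2": [],
    "obj3": [("l1", "p", ["-q(X)", "f(X)"]),
             ("l5", "t", ["f(X)"]),
             ("l6", "mf", ["-f(X)", "f(X)"]),
             ("l2", "k", ["l2:super", "f(Y)"])],
}


def chain(obj: Optional[str]) -> Iterator[str]:
    """Yield obj and then its ancestors, closest first."""
    while obj is not None:
        yield obj
        obj = PARENT[obj]


def resolve(owner: str, body: List[str]) -> List[str]:
    """Replace each inh-atom by the body of the same-labeled rule inherited by owner."""
    out: List[str] = []
    for atom in body:
        if not atom.endswith(":super"):
            out.append(atom)
            continue
        label = atom[: -len(":super")]
        for anc in chain(PARENT[owner]):
            inherited = [rule for rule in IDB[anc] if rule[0] == label]
            if inherited:
                out.extend(resolve(anc, inherited[0][2]))
                break  # an inh-atom with no matching inherited rule is dropped
    return out


def applicable_rules(obj: str, pred: str) -> List[Tuple[str, str, List[str]]]:
    """Rules usable for pred in obj: local ones plus inherited ones whose label is not hidden."""
    hidden: Set[str] = set()
    result: List[Tuple[str, str, List[str]]] = []
    for owner in chain(obj):
        for label, head, body in IDB[owner]:
            if head == pred and label not in hidden:
                result.append((label, owner, resolve(owner, body)))
        hidden.update(label for label, _, _ in IDB[owner])
    return result


for predicate in ("p", "k", "t", "mr", "mf"):
    print(predicate, applicable_rules("obj3", predicate))
```

Under this encoding, the same label in a more specific object hides the inherited rule (overriding), a new label adds a rule (extension), and an inh-atom splices in the inherited body (refinement).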

The operational semantics of Obj$^{inh}$-Datalog is given below. For any database 
$O\text{-}DB = (\{obj_1, \ldots, obj_s\}, \prec)$ and transaction $T$, we denote by $O\text{-}DB \vdash_{\vartheta,S} T$ 
the fact that there is a derivation sequence of $T$ in $O\text{-}DB$ with answer $\vartheta$ 
and collecting a tuple of update sets $S$. We reserve the symbol $\epsilon$ to denote the 
empty (identity) answer, whereas $\vartheta\vartheta'$ denotes the composition of substitutions 
$\vartheta$ and $\vartheta'$. Moreover, let $S$ and $S'$ be s-tuples of update sets; $S \cup S'$ denotes the 
componentwise union of update sets, that is, $(S \cup S')\!\downarrow\! i = S\!\downarrow\! i \cup S'\!\downarrow\! i$, 
for all $i$, $1 \le i \le s$. A set of updates $\{u_1, \ldots, u_n\}$ is consistent if it does not 
contain complementary updates (i.e. $+p(a)$ and $-p(a)$). A tuple of update sets 
$S$ is consistent if all its component update sets are consistent, that is, if for all 
$i$, $1 \le i \le s$, in $S\!\downarrow\! i$ there are no complementary updates. 
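The componentwise union and the consistency check can be sketched as follows. The representation is an assumption made for illustration: an update is a `(sign, ground atom)` pair and a tuple of update sets holds one frozen set per object.

```python
from typing import FrozenSet, Tuple

Update = Tuple[str, str]             # (sign, ground atom), e.g. ("+", "p(a)")
UpdateSet = FrozenSet[Update]
UpdateTuple = Tuple[UpdateSet, ...]  # one update set per object


def union(s1: UpdateTuple, s2: UpdateTuple) -> UpdateTuple:
    """Componentwise union of two s-tuples of update sets."""
    if len(s1) != len(s2):
        raise ValueError("both tuples need one update set per object")
    return tuple(a | b for a, b in zip(s1, s2))


def set_is_consistent(updates: UpdateSet) -> bool:
    """A set is consistent if no atom is both inserted and deleted."""
    return not any(("-", atom) in updates for sign, atom in updates if sign == "+")


def is_consistent(s: UpdateTuple) -> bool:
    """A tuple is consistent if each of its component sets is."""
    return all(set_is_consistent(u) for u in s)


s1 = (frozenset({("+", "p(a)")}), frozenset())
s2 = (frozenset({("-", "p(a)")}), frozenset({("-", "q(b)")}))
print(is_consistent(s1), is_consistent(s2), is_consistent(union(s1, s2)))
```

The example shows two tuples that are consistent on their own but whose union is not, because the first object would both insert and delete the same atom.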

The derivation relation is defined by rules of the form 

\[ \frac{\mathit{Assumptions}}{\mathit{Conclusion}}\ \mathit{Conditions} \]

asserting the Conclusion whenever the Assumptions and Conditions hold. 
$O\text{-}DB \vdash_{\vartheta,S} T$ is a finite successful derivation of $T$ in $O\text{-}DB$ that computes $\vartheta$ and 
collects $S$. A successful derivation is computed as a sequence of derivation steps. 
Each derivation step is performed according to the rules in Fig. 1. 

The index i denotes the current context, that is, the object of the database 
in which the computation is being carried on. It is not present in the first rule, 
and in the Conclusion of rules 2 and 3, modeling the fact that a query is issued 
against the whole database, and the selection of the evaluation context depends 
on the query. 

Rule 1 models the semantics of queries, which are conjunctions of two sub- 
queries, in terms of the semantics of the subqueries. Rules 2 and 3 model queries 
which are deduction atoms and object-labeled deduction atoms, respectively, ac- 
cording to the meaning introduced in Section 2.3. For object-labeled deduction 
atoms in queries the evaluation context is set to the object labeling the atom 
(Rule 3), while for unlabeled deduction atoms a refutation is looked for in any 
object of the database (Rule 2). Rule 4 models the semantics of action atoms 
(the atom is simply added/removed to the set of updates related to the current 
object). Rule 5 handles the empty conjunction, that is, an empty rule body. Ru- 
les 6 and 7 handle c-labeled atoms and v-labeled atoms, respectively, modeling 
the change of evaluation context. Rule 8 handles extensional atoms: conditions 
(a) and (b) are related to the two possibilities for evaluating extensional atoms 
in presence of inheritance hierarchies: the fact is locally defined in the current 
object or it is simply inherited from a most specific ancestor of its. Rule 9 handles 
intensional atoms: 

— Condition (a) refers to the case of a predicate which is defined locally to the 
current object (either a new predicate definition, or overriding) and to the 
case of predicate extension; it states that all the local rules and each rule 
for predicate p in ancestors of object obji whose label does not appear in a 
more specific ancestor of obji, or in obji itself, can be used in the refutation. 

— Condition (b) handles simple inheritance, that is, it considers the case in 
which the current object does not provide a definition for the predicate in 
the goal. In this case, the definition for that predicate in the most specific 
ancestor objj of the current object is employed. 

— Condition (c) models refinement. It considers a rule which is applicable for 
refutation (as the ones in the cases above) and solves the inh-atoms in it. 
This means looking for the rule labeled by $l_y$ (where $l_y$ is the label of the 
inh-atom $l_y : super$) in the most specific object $obj_k$ from which the object 
$obj_j$ (containing the rule we are solving) inherits, and then substituting the 
inh-atom with the body $T'''$ of that rule, properly instantiated. 

Finally, Rule 10, which is similar to Rule 1, handles conjunctions in rule bodies. 

The operational semantics of an Obj$^{inh}$-Datalog database $O\text{-}DB$ is defined as 
the set of ground atoms for which an Obj$^{inh}$-Datalog proof exists. These ground 
atoms are constrained by the set of ground updates their deduction collects, that 
is, $S\vartheta$. The semantics of an Obj$^{inh}$-Datalog database consists of atoms of the 
form $H \leftarrow U$, where $H$ is a ground atom (either intensional or extensional) and 
$U$ are updates. The presence of the atom $H \leftarrow U$ in the semantics means that 
$H$ is true and that its evaluation causes the execution of the updates $U$. 

In the rules below, $O\text{-}DB$ abbreviates $(\{obj_1, \ldots, obj_s\}, \prec)$. 

\[ (1)\quad \frac{O\text{-}DB \vdash_{\vartheta,S} T_1 \qquad O\text{-}DB \vdash_{\vartheta',S'} T_2\vartheta}{O\text{-}DB \vdash_{\vartheta\vartheta',\,S \cup S'} T_1, T_2}\ \mathit{CONS} \]

\[ (2)\quad \frac{i, O\text{-}DB \vdash_{\vartheta,S} p(t)}{O\text{-}DB \vdash_{\vartheta,S} p(t)} \]

\[ (3)\quad \frac{i, O\text{-}DB \vdash_{\vartheta,S} p(t)}{O\text{-}DB \vdash_{\vartheta,S} obj_i : p(t)} \]

\[ (4)\quad i, O\text{-}DB \vdash_{\epsilon,S} \oplus p(t) \qquad S\!\downarrow\! i = \{\oplus p(t)\},\ S\!\downarrow\! k = \emptyset,\ \forall k = 1 \ldots s,\ k \ne i \]

\[ (5)\quad i, O\text{-}DB \vdash_{\epsilon,\emptyset} \Box \]

\[ (6)\quad \frac{j, O\text{-}DB \vdash_{\vartheta,S} p(t)}{i, O\text{-}DB \vdash_{\vartheta,S} obj_j : p(t)} \]

\[ (7)\quad \frac{i, O\text{-}DB \vdash_{\vartheta,S} T \qquad j, O\text{-}DB \vdash_{\vartheta',S'} p(t)\vartheta}{i, O\text{-}DB \vdash_{\vartheta\vartheta',\,S \cup S'} T, X : p(t)}\ \mathit{COND}_j \]

\[ (8)\quad i, O\text{-}DB \vdash_{\vartheta,\emptyset} p(t) \]

if one of the following conditions holds: 

(a) $p(s) \in obj_i$, $mgu(s,t) = \vartheta$; 

(b) $p(s) \in obj_j$, $i \ne j$, $mgu(s,t) = \vartheta$, $obj_i \preceq obj_j$ and $\nexists obj_k$ such that 
$obj_i \preceq obj_k \prec obj_j$ with, in $obj_k$, an extensional fact which unifies with $p(t)$. 

\[ (9)\quad \frac{i, O\text{-}DB \vdash_{\sigma,S} T\vartheta}{i, O\text{-}DB \vdash_{\vartheta\sigma,S} p(t)} \]

if one of the following conditions holds: 

(a) $l_x : p(s) \leftarrow T \in obj_j$, $mgu(s,t) = \vartheta$, $obj_i \preceq obj_j$, $T$ does not contain 
inh-atoms, and $\nexists obj_k$ such that $obj_i \preceq obj_k \prec obj_j$ and $l_y : p(u) \leftarrow T' \in obj_k$, 
$l_x = l_y$; 

(b) $l_x : p(s) \leftarrow T \in obj_j$, $i \ne j$, $mgu(s,t) = \vartheta$, $T$ does not contain inh-atoms, 
$obj_i \preceq obj_j$ and $\forall obj_k$ such that $obj_i \preceq obj_k \prec obj_j$ $\nexists r$ in $obj_k$ whose head 
unifies with $p(t)$; 

(c) $l_x : p(s) \leftarrow T' \in obj_j$, $l_y : super \in T'$, $T' \setminus l_y : super = T''$, 
$l_y : p(u) \leftarrow T''' \in obj_k$, ($\nexists obj_p$ such that $obj_i \preceq obj_p \prec obj_j$ and 
$l_y : p(w) \leftarrow T' \in obj_p$, $l_x = l_y$), $obj_j \prec obj_k$ and not exists $obj_h$ such that 
$obj_j \prec obj_h$ and $obj_h \prec obj_k$, and $mgu(s,u) = \vartheta^*$, $mgu(s\vartheta^*, t) = \vartheta$, 
$T = (T''', T'')\vartheta^*$. 

\[ (10)\quad \frac{i, O\text{-}DB \vdash_{\vartheta,S} A_1 \qquad i, O\text{-}DB \vdash_{\vartheta',S'} A_2\vartheta}{i, O\text{-}DB \vdash_{\vartheta\vartheta',\,S \cup S'} A_1, A_2}\ \mathit{CONS} \]

With: $\oplus \in \{+,-\}$, $\mathit{CONS}$ is $S\vartheta\vartheta' \cup S'$ consistent, $\mathit{COND}_j$ is $obj_j = X\vartheta \land \mathit{CONS}$. 

Fig. 1. Rules defining the derivation relation 

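Definition 11. (Operational Semantics). The operational semantics of an 
Obj$^{inh}$-Datalog database $O\text{-}DB$ is defined as the set 

\[ \mathcal{O}(O\text{-}DB) = \{ A \leftarrow U \mid O\text{-}DB \vdash_{\vartheta,S} T,\ A = T\vartheta,\ U = U_1 \cup \ldots \cup U_s, \]
\[ \text{each } U_i,\ 1 \le i \le s, \text{ is the conjunction of } obj_i : u_{i_j} \text{ for } u_{i_j} \in (S\vartheta)\!\downarrow\! i \} \]

□ 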



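Example 8. Referring to the Obj$^{inh}$-Datalog database $O\text{-}DB$ of Example 6 and 
to transactions $T_1$ and $T_2$ of Example 7, the following holds: 

- $O\text{-}DB \vdash_{\vartheta_1,\emptyset} T_1$ with $\vartheta_1 = \{X/obj_2, Y/b\}$; 

- $O\text{-}DB \vdash_{\vartheta_2,S_2} T_2$ with $\vartheta_2 = \{X/a\}$, $S_2 = (\{-q(X)\}, \emptyset, \emptyset)$ 
and 
$O\text{-}DB \vdash_{\vartheta'_2,S'_2} T_2$ with $\vartheta'_2 = \{X/b\}$, $S'_2 = (\emptyset, \emptyset, \{-q(X)\})$. 

The Obj$^{inh}$-Datalog proofs for those transactions are shown in Fig. 2 and Fig. 
3, respectively. ◇ 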



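Derivation of $obj_3 : k(X,Y)$ (each step is preceded by the rule of Fig. 1 that yields it): 

(8b) $3, O\text{-}DB \vdash_{\vartheta''_1,\emptyset} s(X)$ 
(8b) $3, O\text{-}DB \vdash_{\epsilon,\emptyset} r(b)$ 
(8a) $3, O\text{-}DB \vdash_{\epsilon,\emptyset} f(b)$ 
(10) $3, O\text{-}DB \vdash_{\epsilon,\emptyset} r(b), f(b)$ 
(10) $3, O\text{-}DB \vdash_{\vartheta''_1,\emptyset} s(X), r(b), f(b)$ 
(8a) $2, O\text{-}DB \vdash_{\vartheta'_1,\emptyset} h(Y)$ 
(7) $3, O\text{-}DB \vdash_{\vartheta_1,\emptyset} s(X), r(Y), X : h(Y), f(Y)$ 
(9c) $3, O\text{-}DB \vdash_{\vartheta_1,\emptyset} k(X,Y)$ 
(3) $O\text{-}DB \vdash_{\vartheta_1,\emptyset} obj_3 : k(X,Y)$ 

Derivation of $obj_1 : t(b)$: 

(8a) $1, O\text{-}DB \vdash_{\epsilon,\emptyset} r(b)$ 
(9a) $1, O\text{-}DB \vdash_{\epsilon,\emptyset} t(b)$ 
(3) $O\text{-}DB \vdash_{\epsilon,\emptyset} obj_1 : t(b)$ 

Conclusion: 

(1) $O\text{-}DB \vdash_{\vartheta_1,\emptyset} obj_3 : k(X,Y), obj_1 : t(Y)$ 

with $\vartheta'_1 = \{Y/b\}$, $\vartheta''_1 = \{X/obj_2\}$, $\vartheta_1 = \vartheta'_1\vartheta''_1 = \{Y/b, X/obj_2\}$. 

Fig. 2. Obj$^{inh}$-Datalog proof for transaction $T_1$ of Example 7 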







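First derivation, in $obj_1$: 

(4) $1, O\text{-}DB \vdash_{\epsilon,S_2} -q(X)$ 
(8a) $1, O\text{-}DB \vdash_{\vartheta_2,\emptyset} q(X)$ 
(10) $1, O\text{-}DB \vdash_{\vartheta_2,S_2} -q(X), q(X)$ 
(9a) $1, O\text{-}DB \vdash_{\vartheta_2,S_2} p(X)$ 
(2) $O\text{-}DB \vdash_{\vartheta_2,S_2} p(X)$ 

Second derivation, in $obj_3$: 

(4) $3, O\text{-}DB \vdash_{\epsilon,S'_2} -q(X)$ 
(8a) $3, O\text{-}DB \vdash_{\vartheta'_2,\emptyset} f(X)$ 
(10) $3, O\text{-}DB \vdash_{\vartheta'_2,S'_2} -q(X), f(X)$ 
(9a) $3, O\text{-}DB \vdash_{\vartheta'_2,S'_2} p(X)$ 
(2) $O\text{-}DB \vdash_{\vartheta'_2,S'_2} p(X)$ 

with $\vartheta_2 = \{X/a\}$, $S_2 = (\{-q(X)\}, \emptyset, \emptyset)$, $\vartheta'_2 = \{X/b\}$, $S'_2 = (\emptyset, \emptyset, \{-q(X)\})$. 

Fig. 3. Obj$^{inh}$-Datalog proof for transaction $T_2$ of Example 7 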



3.2 Update Phase Semantics 

As we have said, in the marking phase updates are collected and their consistency 
is checked but they are not executed. The most common approach to introducing 
updates in declarative rules is that updates (very often defined in rule bodies) 
are executed as soon as they are evaluated [20]. Under this assumption the 
evaluation of a rule is performed in a sequence of states and thus the declarativeness 
of the query part is lost. Under the marking and update phases, the first phase 
is declarative and preserves this property for the query part, while accommodating 
update specifications that are executed all together in the update phase. This 
allows one to express within this semantics the transactional behavior where all 
or none of the updates must be executed. At the logical level, this semantics avoids 
undoing updates that form a transaction. Indeed, updates collected in the 
marking phase must be executed in the update phase, and it is not possible that 
some of them will be undone due to the checks in the former phase. Let us now 
see the update phase semantics. 

First of all we define the semantics of a query $T$ with respect to an 
Obj$^{inh}$-Datalog database $O\text{-}DB$. First we note that database systems use a default 
set-oriented semantics, that is, the query-answering process computes a set of 
answers. We denote with $Set(T, O\text{-}DB)$ the set of pairs (bindings and updates) 
computed as answers to the transaction $T$. 

\[ Set(T, O\text{-}DB) = \{ (\vartheta, u) \mid O\text{-}DB \vdash_{\vartheta,S} T,\ u = S\vartheta \} \]

We now define a function which takes a set of ground updates, the current 
extensional components of the objects constituting the database and returns the 
new extensional components. 

Definition 12. Let $EDB_1^i, \ldots, EDB_s^i$ be the current extensional components 
of the objects constituting the database and $u_1, \ldots, u_s$ be an s-tuple of consistent 
sets of ground updates. Then the new databases $EDB_1^{i+1}, \ldots, EDB_s^{i+1}$ are 
computed by means of the function $\Delta : \mathcal{EC} \times \mathcal{U} \rightarrow \mathcal{EC}$ as follows: 

\[ \Delta((EDB_1^i, \ldots, EDB_s^i), (u_1, \ldots, u_s)) = (EDB_1^{i+1}, \ldots, EDB_s^{i+1}) \]

where each $EDB_j^{i+1}$, with $j = 1 \ldots s$, is computed from $EDB_j^i$ and $u_j$ as 

\[ (EDB_j^i \setminus \{ p(t) \mid -p(t) \in u_j \}) \cup \{ p(t') \mid +p(t') \in u_j \} \]

where $\mathcal{EC}$ denotes all possible s-tuples of extensional components (i.e. of sets of 
facts) and $\mathcal{U}$ denotes all possible s-tuples of update sets. □ 
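A minimal sketch of this state-transition function follows, assuming that facts are plain strings and that an update is a `(sign, ground atom)` pair; the function name `delta` and this encoding are illustrative choices, not notation from the paper. The sample data are those of Examples 6 and 9.

```python
from typing import FrozenSet, Tuple

Update = Tuple[str, str]  # (sign, ground atom), e.g. ("-", "q(a)")


def delta(
    edbs: Tuple[FrozenSet[str], ...],
    updates: Tuple[FrozenSet[Update], ...],
) -> Tuple[FrozenSet[str], ...]:
    """Apply each object's update set to that object's extensional component."""
    new_edbs = []
    for edb, us in zip(edbs, updates):
        deleted = {atom for sign, atom in us if sign == "-"}
        inserted = {atom for sign, atom in us if sign == "+"}
        new_edbs.append(frozenset((edb - deleted) | inserted))
    return tuple(new_edbs)


edbs = (
    frozenset({"q(a)", "r(b)", "s(obj2)"}),  # EDB of the first object
    frozenset({"h(b)"}),                      # EDB of the second object
    frozenset({"f(b)", "q(b)"}),              # EDB of the third object
)
updates = (frozenset({("-", "q(a)")}), frozenset(), frozenset({("-", "q(b)")}))
print(delta(edbs, updates))
```

Deletions are applied before insertions, as in the definition; since the update sets are assumed consistent, the order does not affect the result.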

The update phase semantics models as observable properties of a transaction the 
set of answers, the object states and the result of the transaction itself. It is 
called $Oss = \langle Ans, State, Res \rangle$ where $Ans$ is the set of answers, $State$ is an 
s-tuple constituted by the extensional components of objects in the database 
and $Res$ is the transactional result, that is, either Commit or Abort. The set of 
possible observables $Oss$ is $OSS$. 

Definition 13. Let $O\text{-}DB^i$ be an Obj$^{inh}$-Datalog database, with $EDB^i$ the tuple 
of current object states and $O\text{-}IDB$ the tuple of method sets of objects. The 
semantics of a transaction is denoted by function $\mathcal{S}_{O\text{-}IDB}(T) : \mathcal{EC} \rightarrow OSS$. 

\[ \mathcal{S}_{O\text{-}IDB}(T)(EDB^i) = \begin{cases} Oss^{i+1} & \text{if } OK \\ \langle \emptyset, EDB^i, Abort \rangle & \text{otherwise (inconsistency)} \end{cases} \]

where $Oss^{i+1} = \langle \{ \vartheta_j \mid (\vartheta_j, u_j) \in Set(T, O\text{-}DB^i) \}, EDB^{i+1}, Commit \rangle$ and $EDB^{i+1}$ 
is computed by means of $\Delta(EDB^i, u)$. The condition $OK$ expresses the fact that 
all the components of the tuple of sets $u = \bigcup_j u_j$ are consistent, that is, there 
are no complementary ground updates on the same object. □ 

Note that, according to the above definition, in Obj$^{inh}$-Datalog the abort of a 
transaction may be caused by a transaction that generates an update set with 
complementary updates on the same atom in the same object (both the insertion 
and the deletion of the atom). In this case the resulting object state would depend 
on the execution order of updates, so we disallow this situation by aborting the 
transaction. In this way we ensure that the defined semantics is deterministic. 
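The commit/abort behavior can be sketched as follows, assuming the answers of the marking phase are already available as pairs of a binding and a tuple of ground update sets. The function `execute` and the data encoding are hypothetical and only mirror the shape of Definition 13.

```python
from typing import FrozenSet, List, Tuple

Binding = Tuple[Tuple[str, str], ...]   # e.g. (("X", "a"),)
Update = Tuple[str, str]                # (sign, ground atom)
Answer = Tuple[Binding, Tuple[FrozenSet[Update], ...]]
State = Tuple[FrozenSet[str], ...]      # one set of facts per object


def execute(answers: List[Answer], state: State):
    """Merge the update sets of all answers; commit if consistent, otherwise abort."""
    merged = [set() for _ in state]
    for _, updates in answers:
        for i, us in enumerate(updates):
            merged[i] |= us
    for us in merged:
        # Complementary updates on the same atom in the same object cause an abort.
        if any(("-", atom) in us for sign, atom in us if sign == "+"):
            return frozenset(), state, "Abort"
    new_state = tuple(
        frozenset((edb - {a for s, a in us if s == "-"}) | {a for s, a in us if s == "+"})
        for edb, us in zip(state, merged)
    )
    return frozenset(binding for binding, _ in answers), new_state, "Commit"


state = (frozenset({"q(a)", "r(b)", "s(obj2)"}), frozenset({"h(b)"}), frozenset({"f(b)", "q(b)"}))
answers = [
    ((("X", "a"),), (frozenset({("-", "q(a)")}), frozenset(), frozenset())),
    ((("X", "b"),), (frozenset(), frozenset(), frozenset({("-", "q(b)")}))),
]
print(execute(answers, state))
```

On abort the state is returned unchanged and the answer set is empty, which matches the all-or-nothing reading of the update phase.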



Here we denote with $O\text{-}DB^i$ the Obj$^{inh}$-Datalog database to emphasize that we 
consider object states $EDB^i$ at time $i$. 

Example 9. Referring to the object database of Example 6, and to transaction 
$T_2$ of Example 7, whose answers have been computed in Example 8 

\[ \Delta((EDB_1, EDB_2, EDB_3), (\{-q(a)\}, \emptyset, \{-q(b)\})) = (EDB'_1, EDB'_2, EDB'_3) \]
with 

- $EDB'_1 = \{r(b), s(obj_2)\}$ and 

- $EDB'_3 = \{f(b)\}$. 

Moreover, 

\[ \mathcal{S}_{O\text{-}IDB}(T_2)(EDB) = \langle \{\{X/a\}, \{X/b\}\}, EDB', Commit \rangle. \] ◇ 

Note that the update phase semantics specified in [24] for U-Datalog handles 
transactions composed from atomic transactions, as the ones we support, 
through the sequence operator. That semantics can trivially be extended 
to Obj$^{inh}$-Datalog, by taking into account that tuples of object states and tuples 
of update sets must be considered rather than a single database state and 
a single update set. 

4 Obj$^{inh}$-Datalog Interpreter 

A prototype implementation of the Obj$^{inh}$-Datalog language has been developed 
at the University of Genova, using KBMS1, a knowledge base management 
system developed in HP laboratories at Bristol [21]. The language of KBMS1, 
kbProlog, is an extension of Prolog with modularization facilities, declarative update 
operations and persistence support. The implementation of the language has 
been realized in two steps: (i) development of a translator from Obj$^{inh}$-Datalog 
to U-Datalog; (ii) development of a bottom-up interpreter for U-Datalog. The 
bottom-up interpreter for U-Datalog handles updates with a non-immediate 
semantics and provides the transactional behavior. The use of a bottom-up evaluation 
strategy ensures termination. The choice of implementing Obj$^{inh}$-Datalog 
via a translation into U-Datalog is due to the fact that the definition and 
implementation of Obj$^{inh}$-Datalog is part of a project which aims at developing an 
enhanced database language, equipped with an efficient implementation. Several 
optimization techniques for U-Datalog have been developed [4] that will lead 
to an optimized U-Datalog interpreter and therefore to an optimized 
Obj$^{inh}$-Datalog interpreter. 

An alternative implementation might realize a “direct” interpreter for 
Obj$^{inh}$-Datalog, adapting one of the several evaluation techniques developed for 
deductive databases to object deductive databases (so taking into account message 
passing, object state evolution and method inheritance). This is a possible issue 
for future investigation. Our prototype is based on the following steps: (i) an 
Obj$^{inh}$-Datalog program $OP$ is translated into an Obj-U-Datalog [5] program 
$OP'$, that is, inheritance relationships are eliminated and each object is extended 
so that it explicitly contains its structural and behavioral information (thus 
flattening the inheritance hierarchy); (ii) the Obj-U-Datalog program $OP'$ is 
translated into a U-Datalog program $UP$; (iii) each Obj$^{inh}$-Datalog query $OQ$ 
is first translated into a U-Datalog query $UQ$, and then executed against the 
program $UP$ using the U-Datalog interpreter. In what follows, we describe each 
step. 

Step 1: Flattening the inheritance hierarchy. This step makes explicit 
the set of facts and rules available for refutation in each object of the object 
database. Consider an object $obj_j$ whose direct parent object is object $obj_i$. 

- $EDB'_j$ is obtained from $EDB_j$ as follows 

\[ EDB'_j = EDB_j \cup \{ p(c) \mid p \in \Pi^e,\ c \text{ tuple of constants in } \Sigma,\ p(c) \in EDB'_i,\ EDB_j \text{ does not contain any fact on predicate } p \} \]

- $IDB'_j$ is obtained from $IDB_j$ as follows 

  - $IDB'_j$ contains all the rules of $IDB_j$, whose bodies are modified by 
  solving the inh-atoms in rule bodies; 
  an inh-atom $l_x : super$ is solved by replacing it with the body of the rule 
  labeled by $l_x$ in the parent object $obj_i$, after having properly unified the 
  rule heads and applied the obtained mgu to the rule body®; 

  - $IDB'_j$ contains all rules of $IDB'_i$ whose labels do not appear in $\mathcal{L}(obj_j)$. 

The flattening process described above is recursively applied starting from the 
root objects of the inheritance hierarchy (that is, the objects $obj_j$ such that 
$\nexists obj_i : obj_j \prec obj_i$), and visiting the inheritance tree in a top-down style until the 
leaves of the tree are reached. 
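The flattening step can be sketched as a top-down pass over the hierarchy. The encoding is an assumption made for illustration: facts are `(predicate, argument)` pairs, rules are `(label, head predicate, body)` triples, an inh-atom is the string `"l:super"`, and rule heads are assumed to share variable names so that the unification step reduces to copying the body. The label indices follow one reading of Example 6.

```python
from typing import Dict, List, Optional, Set, Tuple

Fact = Tuple[str, str]             # (predicate, argument)
Rule = Tuple[str, str, List[str]]  # (label, head predicate, body atoms)

PARENT: Dict[str, Optional[str]] = {"obj1": None, "obj2": None, "obj3": "obj1"}
EDB: Dict[str, Set[Fact]] = {
    "obj1": {("q", "a"), ("r", "b"), ("s", "obj2")},
    "obj2": {("h", "b")},
    "obj3": {("f", "b"), ("q", "b")},
}
IDB: Dict[str, List[Rule]] = {
    "obj1": [("l1", "p", ["-q(X)", "q(X)"]), ("l2", "k", ["s(X)", "r(Y)", "X:h(Y)"]),
             ("l3", "t", ["r(X)"]), ("l4", "mr", ["-r(X)", "r(X)"])],
    "obj2": [],
    "obj3": [("l1", "p", ["-q(X)", "f(X)"]), ("l5", "t", ["f(X)"]),
             ("l6", "mf", ["-f(X)", "f(X)"]), ("l2", "k", ["l2:super", "f(Y)"])],
}


def flatten(parent, edb, idb):
    """Return flattened EDBs and IDBs; parents are always processed before children."""
    flat_edb: Dict[str, Set[Fact]] = {}
    flat_idb: Dict[str, List[Rule]] = {}

    def visit(obj: str) -> None:
        if obj in flat_edb:
            return
        par = parent[obj]
        if par is None:
            flat_edb[obj], flat_idb[obj] = set(edb[obj]), list(idb[obj])
            return
        visit(par)
        # Facts: inherit only predicates with no local fact (per-predicate overriding).
        local_preds = {pred for pred, _ in edb[obj]}
        flat_edb[obj] = set(edb[obj]) | {f for f in flat_edb[par] if f[0] not in local_preds}
        # Rules: solve inh-atoms against the flattened parent, then add unhidden rules.
        inherited = {label: body for label, _, body in flat_idb[par]}
        rules: List[Rule] = []
        for label, head, body in idb[obj]:
            new_body: List[str] = []
            for atom in body:
                if atom.endswith(":super"):
                    new_body.extend(inherited.get(atom[: -len(":super")], []))
                else:
                    new_body.append(atom)
            rules.append((label, head, new_body))
        local_labels = {label for label, _, _ in idb[obj]}
        rules += [rule for rule in flat_idb[par] if rule[0] not in local_labels]
        flat_idb[obj] = rules

    for obj in parent:
        visit(obj)
    return flat_edb, flat_idb


flat_edb, flat_idb = flatten(PARENT, EDB, IDB)
print(sorted(flat_edb["obj3"]))
print(flat_idb["obj3"])
```

Because each parent is flattened before its children, solving an inh-atom only needs to look at the direct parent, which already contains everything it inherits.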

At the end of the flattening process, the Obj$^{inh}$-Datalog rules are transformed 
into Obj-U-Datalog rules by omitting the rule labels. 



Step 2: Translation of facts and rules. The translation from Obj-U-Datalog 
to U-Datalog is simple. For each object $obj_i \in O\text{-}DB$, for each predicate $p$ of 
arity $n$ defined in $obj_i$ we have a corresponding predicate $p$ of arity $n+1$ defined 
in the U-Datalog $DB$. The argument added to each predicate refers to the object 
in which the predicate is defined. The extensional component of an object $obj_i$, 
i.e. $EDB_i$, is translated as follows. For each fact $p(\tilde{a})$ in $EDB_i$, with $\tilde{a}$ a tuple 
of constants, we have a fact $p(obj_i, \tilde{a})$ in $DB$. The extensional database of the 
U-Datalog program consists of the union of the translation of the extensional 
components of each object. 

The intensional rules are translated as follows. Consider the rule, defined in 
object $obj_i$, 

\[ p(\tilde{X}) \leftarrow B_1(\tilde{Y}_1), \ldots, B_k(\tilde{Y}_k),\ obj_{j_1} : B_{k+1}(\tilde{Y}_{k+1}), \ldots, obj_{j_n} : B_{k+n}(\tilde{Y}_{k+n}), \]
\[ X_1 : B_{k+n+1}(\tilde{Y}_{k+n+1}), \ldots, X_p : B_{k+n+p}(\tilde{Y}_{k+n+p}). \]

This rule is translated into the following U-Datalog rule: 

\[ p(obj_i, \tilde{X}) \leftarrow B_1(obj_i, \tilde{Y}_1), \ldots, B_k(obj_i, \tilde{Y}_k),\ B_{k+1}(obj_{j_1}, \tilde{Y}_{k+1}), \ldots, B_{k+n}(obj_{j_n}, \tilde{Y}_{k+n}), \]
\[ B_{k+n+1}(X_1, \tilde{Y}_{k+n+1}), \ldots, B_{k+n+p}(X_p, \tilde{Y}_{k+n+p}). \]

The intensional database of the U-Datalog program consists of the union of the 
translations of all the rules of the intensional component of each object. 

® If the rule heads cannot be unified, the inh-atom is simply removed from the $IDB_j$ 
rule body. 

Step 3: Translation of transactions. A transaction is translated into the 
conjunction of the translations of the (possibly labeled) atoms that constitute it. A 
labeled atom $obj_i : p(X)$ is translated into a U-Datalog atom $p(obj_i, X)$. An 
unlabeled atom $p(X)$ in a transaction, which, as we have seen, is interpreted as a 
transaction directed to the whole database, is translated into $p(O, X)$, where $O$ 
is a new variable. Note that in this way we obtain in the solution not only the 
instances of $p(X)$ satisfied by the database, but also the objects in which such 
instances were found. 
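The translation of facts, body atoms and transactions amounts to adding one leading argument. The sketch below assumes atoms are given as `(label, predicate, arguments)` triples with `label` set to `None` for unlabeled atoms; the helper names and the naming of fresh variables (`O1`, `O2`, ...) are assumptions made here, since the text only requires the variable to be new.

```python
from itertools import count
from typing import List, Optional, Sequence, Tuple

Atom = Tuple[Optional[str], str, Sequence[str]]  # (label or None, predicate, arguments)


def fmt(pred: str, args: Sequence[str]) -> str:
    """Render a predicate applied to its arguments."""
    return f"{pred}({', '.join(args)})"


def translate_fact(obj: str, pred: str, args: Sequence[str]) -> str:
    """A fact p(a) of an object becomes p(obj, a)."""
    return fmt(pred, (obj, *args))


def translate_body_atom(obj: str, atom: Atom) -> str:
    """In a rule of obj, unlabeled atoms get obj; labeled atoms get their label."""
    label, pred, args = atom
    return fmt(pred, ((label or obj), *args))


def translate_transaction(atoms: List[Atom]) -> str:
    """Labeled atoms keep their object; each unlabeled atom gets a fresh object variable."""
    fresh = (f"O{n}" for n in count(1))
    return ", ".join(
        fmt(pred, ((label or next(fresh)), *args)) for label, pred, args in atoms
    )


print(translate_fact("obj1", "q", ("a",)))
print(translate_body_atom("obj1", ("X", "h", ("Y",))))
print(translate_transaction([("obj3", "k", ("X", "Y")), ("obj1", "t", ("Y",))]))
print(translate_transaction([(None, "p", ("X",))]))
```

Keeping the object as an explicit first argument is what lets an unlabeled query report, through the binding of the fresh variable, the object in which each answer was found.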

Example 10. The U-Datalog program resulting from the translation of the object 
database of Example 6 is the following. 

\[
\begin{array}{ll}
EDB = & q(obj_1,a) \quad r(obj_1,b) \quad s(obj_1,obj_2) \quad h(obj_2,b) \\
      & f(obj_3,b) \quad q(obj_3,b) \quad r(obj_3,b) \quad s(obj_3,obj_2) \\
IDB = & p(obj_1,X) \leftarrow -q(obj_1,X),\ q(obj_1,X) \\
      & k(obj_1,X,Y) \leftarrow s(obj_1,X),\ r(obj_1,Y),\ h(X,Y) \\
      & t(obj_1,X) \leftarrow r(obj_1,X) \\
      & mr(obj_1,X) \leftarrow -r(obj_1,X),\ r(obj_1,X) \\
      & p(obj_3,X) \leftarrow -q(obj_3,X),\ f(obj_3,X) \\
      & k(obj_3,X,Y) \leftarrow s(obj_3,X),\ r(obj_3,Y),\ h(X,Y),\ f(obj_3,Y) \\
      & t(obj_3,X) \leftarrow r(obj_3,X) \\
      & t(obj_3,X) \leftarrow f(obj_3,X) \\
      & mr(obj_3,X) \leftarrow -r(obj_3,X),\ r(obj_3,X) \\
      & mf(obj_3,X) \leftarrow -f(obj_3,X),\ f(obj_3,X)
\end{array}
\]

Moreover, the transactions of Example 7 are translated as follows: 

- $T_1 = k(obj_3,X,Y), t(obj_1,Y)$; 

- $T_2 = p(O,X)$. ◇ 

5 Related Work 

Several research proposals attempt to combine object-orientation, databases, 
and logical languages. There are different orthogonal dimensions along which 
the approaches to the integration of the deductive and object paradigms may be 
classified. A survey of those proposals can be found in [5]. 

Most of the approaches do not consider state evolution of deductive objects. 
More precisely, the characterization of objects as logic theories, coming from 
object-oriented extensions of logic programming, does not account for any 
notion of state. McCabe suggests that the change of state for an instance can be 
simulated by creating new instances [22]. Other proposals simulate state changes 
by using assert and retract but this approach lacks any logical foundation. In [10] 
intensional variables are introduced to keep track of state changes without side 
effects. In other proposals, multi-headed clauses are used for similar purposes. 
However, the notion of updating object state does not fit well in object-oriented 
extensions of logic programming. In addition, approaches developed in the 
database field, such as [13,14,19], do not consider state evolution. Many of the 
approaches [1,14,8], moreover, do not consider the behavioral component of 
objects, that is, methods. We think that this is an important issue because it 
overcomes the dichotomy between data and operations of the relational model. 

Few proposals, moreover, deal with behavioral inheritance and overriding. In 
addition to [2,19,22], these topics have been addressed in [7,11,15]. All these 
proposals extend F-logic [17] (or F-logic variations) with behavioral inheritance. 
In F-logic, indeed, only structural inheritance is directly captured. For beha- 
vioral inheritance, the non-monotonic aspects introduced by the combination 
of overriding and dynamic binding are modeled only indirectly by means of an 
iterated fixpoint construction. Moreover, in F-logic, only ground data expressi- 
ons, that is, values resulting from the application of a method, and not method 
implementations, can be inherited along the inheritance hierarchy. 

In GuLog [11] overriding and conflicts arising from multiple inheritance are 
investigated, in a model similar to F-logic. In GuLog, however, the schema and 
instance levels are separated. In ORLog [15] overriding and withdrawal of pro- 
perties are supported. Withdrawal is used to prevent the inheritance of some 
properties in subclasses. It can thus result in non-monotonic inheritance of sig- 
natures. A reasonable use of that mechanism is for preference specification for 
conflict resolution in case of multiple inheritance. In [7] Bugliesi and Jamil also 
deal with the behavioral aspects of deductive object languages. Their language, 
moreover, also allows dynamic subclassing, that is, the definition of inheritance 
relationships through rules on schemas (which are not allowed in GuLog and 
ORLog). Dynamic subclassing raises non-monotonicity problems and leads to 
the introduction of a notion of i-stratification to guarantee the existence of a 
unique stable model. All these proposals, however, despite their differences, 
deal with overriding on a per-predicate basis and do not consider any form of 
state evolution®. 

A finer granularity of rule composition is offered by languages supporting 
embedded implication [3,12,23]. Embedded implication allows one to realize also 
some of the other features of our language (such as message passing and conserva- 
tive inheritance), but does not account for all of them (for instance, overriding). 
We remark, moreover, that our way of supporting such features is very closely 
related to the basic modeling notions of the object paradigm. This makes it 
easier to develop rule sets and to reuse them. 



Actually, state evolution can be accommodated in F-Logic through Transaction 
logic [6], as discussed in [16]. Also in this case, however, predicate inheritance and 
overriding work on a per-predicate basis. 

6 Conclusions 

We have proposed an approach to express inheritance in deductive object da- 
tabases. Deductive object databases are based on deductive objects that can 
change state. Cooperation among objects is defined by inheritance and message 
passing. Several types of inheritance have been investigated and a formal opera- 
tional semantics for the language is given. This semantics models objects with 
the granularity of theory, updates, methods, message passing, and inheritance as 
well as transactional behavior. Finally, a prototype has been implemented and 
a sketch of the interpreter for Obj*"^-Datalog is provided. 

Our main direction of future work concerns the investigation of the
applicability of the proposed approach to other deductive object languages
with updates (such as Transaction F-Logic) and to other declarative object
models that allow dynamic aspects to be specified.
Abstract. The Co-nets approach that we are developing is an object-oriented
(OO) specification model based on a formal and complete integration of OO
concepts and constructions into an appropriate variant of algebraic Petri
nets. Interpreted in rewriting logic, the approach is particularly tailored
for specifying and validating advanced information systems as distributed,
autonomous yet cooperative components. However, in the spirit of most
existing conceptual models, the Co-nets approach requires that all system
aspects be known during specification and fixed at once. This contrasts with
reality, where most systems, due to changes in business and legal factors,
have to change their behaviour in unexpected ways during their long
life-span. To overcome this crucial limitation, we present in this paper
first steps towards an appropriate extension of the Co-nets approach for
naturally dealing with specification evolution. The main ideas are as
follows. First, we distinguish between a rigid, fixed part of object
behaviour and a modifiable one. Second, besides usual transitions and places,
we introduce the notions of meta-places and meta-transitions for dynamically
governing the modifiable behaviour. Third, we propose for meta-transitions
two-step (i.e. meta- and object-level) valuated rewriting rules.



1 Introduction 

Present-day information systems are becoming more and more complex in size
and, more especially, in space. For their crucial specification/validation
phase, there is an overwhelming need for more appropriate object-oriented
conceptual models [19]. Indeed, in contrast to (mostly sequential) existing
OO specification models, which conceive such systems as a community of
objects, advanced models have to conceive them as fully distributed,
autonomous yet cooperative components. Each component has to be regarded at
least as a hierarchy of classes with different forms of inheritance (i.e.
simple, multiple, with overriding), object composition, and aggregation.
Distribution in such components has to be reflected by true intra- as well as
inter-object concurrency, and by different forms of communication, including
synchronous and asynchronous ones. Autonomy, which has to be coupled with
close cooperation, should in particular be reflected by an encapsulation of
proper features in each component and by the existence of explicit interfaces
through which such components interact.

Besides these challenges, the capability of specifying evolving aspects in
such systems is also of crucial importance. In fact, due to the very long
life-span (usually several years) of these systems, which are subject to
changes of laws as well as market pressure, advanced conceptual models also
have to be semantically rich enough to capture this dynamic, runtime
evolution naturally. Such evolution must be achieved without resorting to
conceiving new systems from scratch or modifying their corresponding
implementation in an ad hoc manner, which would lose the essence and crucial
relevance of the specification phase.

For tackling these issues, there are some ongoing approaches including, for
instance, distributed temporal logic (DTL) [10] with the Troll language [15]
and Real-time Object Specification Logic (ROSL) [4] with the Albert II
language [9].

Besides these more property-oriented formal frameworks, we are developing
a multi-paradigm approach referred to as Co-nets. It soundly combines ideas
from object orientation [21], modularity and system interconnection [12,1],
high-level Petri nets [14], and rewriting logic [17]. The main features of our
approach [2,3] for adequately specifying and validating advanced information
systems as fully distributed, autonomous yet cooperative components may be
summarized as follows:

— We conceive a class as a module with a hidden part, including structure as
well as behaviour, and an observed part, including structural as well as
behavioural aspects. The observed part is used as an interface for
interacting with the environment and other classes. In each class, object
states are modeled as terms with identity and gathered into an appropriate
(object) place, while with each method invocation a corresponding (message)
place is associated. Transitions reflect the body of such methods (i.e., the
effect of messages on the object states to which they are sent), where
appropriate splitting/recombination of object states is allowed for a full
exhibition of intra- as well as inter-object concurrency.

— An incremental construction of components, as a hierarchy of classes, using
simple and multiple inheritance (with redefinition, associated polymorphism
and dynamic binding), object composition and aggregation. Such components
behave (i.e., their general transition form) with respect to an appropriate
intra-component evolution pattern that naturally supports intra- as well as
inter-object concurrency. Moreover, due to the possibility of
splitting/recombining object states, the modeling of different forms of
inheritance neither requires any complex formalization nor suffers from the
well-known inheritance anomaly problem [16].

— For making different components interact, and thereby constructing more
complex systems as cooperative components, an adequate inter-component
interaction pattern is proposed; it enhances concurrency and preserves the
encapsulated (i.e. hidden) features of each of the interacting components.

— By interpreting Co-nets behaviour in rewriting logic, rapid prototypes
may be generated. This can be achieved either using rewrite techniques in
general [8] or, in particular, the current implementation of the Maude
language [5].

The main focus of the present paper is to enhance the Co-nets approach to 
deal with dynamic, runtime changes in a very comprehensible but nevertheless 
well-founded way. In some detail, our approach to dynamic modification — as a 
natural extension of the just mentioned Co-nets features — may be explained 
as follows: 

— Following the approach of [20,6] to specifying evolving behaviour in
information systems, we assume that not all object behaviour is subject to
change. That is, some part of the object behaviour is rigid, fixed forever,
reflecting minimal properties of the modeled application. On the other hand,
due to our modeling of inheritance using splitting/recombination of object
states as needed, incremental extensions of a given specification by
introducing new messages and new behaviour (as subclasses) do not affect in
any way the already running behaviour. As a result, we make a clear
distinction between specification extension and behaviour modification. The
extension, as mentioned, is rigorously handled at the object level. For
coping with behaviour modification as a change in method bodies (i.e., the
effect of messages on object states) we propose a meta-object level.

— Besides the usual object and message places and transitions, we introduce
in each component new constructions for capturing this behaviour
modification. We refer to them as meta-places and meta-transitions,
respectively. Each meta-place, associated with a given component, contains as
tokens the behaviour which may be assigned to a given (meta-)transition at
runtime (i.e. at the moment of firing this transition). More precisely,
tokens in meta-places are conceived as tuples composed of: transition
identifiers, input tokens and created tokens with their respective places,
and conditions for firing the associated transition. Hence, since such tokens
can be created, deleted or modified using transitions as in usual places,
behaviour modification is straightforwardly achieved. Meta-transitions are
defined as non-instantiated transitions. Their input arcs, output arcs, and
conditions are just specific variables. Therefore, it is only at the time of
their firing that they receive an appropriate instantiation, and thereby the
corresponding behaviour, from the meta-place.

— For semantically and correctly interpreting the intended behaviour of this
meta-object level, we propose an appropriate (and just one) inference rule
that we call the meta-rule. This rule propagates the abstract behaviour from
the meta-object level (i.e. meta-places and meta-transitions) to the usual
object level (i.e. places and transitions). At firing time, the effect of the
meta-rule consists in transforming a meta-transition into a usual transition
whose behaviour comes precisely from the meta-place.

The remainder of this paper is organized as follows. In section 2, we review
the main aspects of the Co-nets approach using a simplified account
specification. In section 3, we discuss how behaviour modification is
syntactically and semantically handled by extending Co-nets. We conclude this
paper with some remarks and future work.

2 The Co-nets Approach: An Overview 

The Co-nets approach is a new form of object-oriented Petri net-based model 
more tailored to the specification and rapid-prototyping of distributed informa- 
tion systems [3]. 

2.1 CO-Net: Template and Class Specification 

This section deals with the modelling of the basic concepts of the object-oriented 
paradigm, namely objects, templates, and classes. We first present the structure, 
or what is commonly called the object signature templates [11]. Then we describe 
how specification templates and classes are specified. 



Template Signature Specification. The template signature defines the
structure of the object states and the form of the operations that have to be
accepted by such states. Basically, in the Co-nets approach, we follow the
general object signature proposed for Maude [18]. Object states are regarded
as terms (precisely, as tuples) and messages as operations sent or received
by objects. However, apart from these general conceptual similarities, and in
order to be closer to the aforementioned information system requirements, the
OO signature that we propose can informally be described as follows:

— The object states are terms of the form

$\langle Id \mid atr_1 : val_1, \ldots, atr_k : val_k, at\_bs_1 : val'_1, \ldots, at\_bs_{k'} : val'_{k'} \rangle$

where

• $Id$ is an observed object identity taking its values from an appropriate
abstract data type OId.

• $atr_1, \ldots, atr_k$ are the local attribute identifiers, hidden from the
outside, having as actual values $val_1, \ldots, val_k$, respectively.

• The observed part of an object state is identified by
$at\_bs_1, \ldots, at\_bs_{k'}$, and the associated actual values are
$val'_1, \ldots, val'_{k'}$.

• Also, we assume that all attribute identifiers (local or observed) range
over a suitable sort denoted AId, while their associated values range over
the sort Value, with OId < Value (i.e. OId as a subsort of Value) in order
to allow object-valued attributes.

— In contrast to the indivisible object state proposed in Maude, which rules
out any form of intra-object concurrency, we introduce a powerful axiom,
called the splitting/recombination axiom, that permits splitting (resp.
recombining) the object state as needed. This axiom can be described as
follows: $\langle Id \mid attrs_1, attrs_2 \rangle = \langle Id \mid attrs_1 \rangle \oplus \langle Id \mid attrs_2 \rangle$ ¹.
As we present in more detail later, first, it allows us to exhibit
intra-object concurrency ². Second, it provides a meaning to our notion of
observed attributes by allowing a separation between intra- and
inter-component evolution. Third, it allows us to drastically simplify the
conceptualization of inheritance.

— In addition to conceiving messages as terms (consisting of the message
name, the identifiers of the objects the message is addressed to, and,
possibly, parameters), we make a clear distinction between internal, local
messages and external ones, which are imported or exported. Local messages
allow the object states of a given class to evolve, while the external ones
allow communication between different classes by exclusively using their
observed attributes.
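
The splitting/recombination axiom can be pictured with a small Python sketch. It is only an illustration under stated assumptions: an object state is modelled as an identity paired with a dictionary of attributes, and the names `split`, `recombine`, and the sample account values are hypothetical, not part of the Co-nets notation.

```python
from typing import Dict, Iterable, Tuple

# A state fragment <Id | attrs> is modelled as (identity, attribute dict).
Fragment = Tuple[str, Dict[str, int]]


def split(state: Fragment, names: Iterable[str]) -> Tuple[Fragment, Fragment]:
    """Read the axiom left to right: <Id|a1, a2> becomes <Id|a1> and <Id|a2>."""
    ident, attrs = state
    chosen = set(names)
    missing = chosen - set(attrs)
    if missing:
        raise KeyError(f"unknown attributes: {sorted(missing)}")
    picked = {k: v for k, v in attrs.items() if k in chosen}
    rest = {k: v for k, v in attrs.items() if k not in chosen}
    return (ident, picked), (ident, rest)


def recombine(left: Fragment, right: Fragment) -> Fragment:
    """Read the axiom right to left: two fragments of one object are merged."""
    (id1, attrs1), (id2, attrs2) = left, right
    if id1 != id2:
        raise ValueError("fragments belong to different objects")
    if set(attrs1) & set(attrs2):
        raise ValueError("fragments overlap on some attribute")
    return id1, {**attrs1, **attrs2}


# Hypothetical account state; two messages touch disjoint fragments.
account: Fragment = ("acc1", {"bal": 100, "Lmt": 0, "Ints": 2})
bal_part, remainder = split(account, ["bal"])
ints_part, remainder = split(remainder, ["Ints"])

bal_part = (bal_part[0], {"bal": bal_part[1]["bal"] + 50})  # a deposit of 50
ints_part = (ints_part[0], {"Ints": 3})                      # interest raised to 3

merged = recombine(recombine(bal_part, ints_part), remainder)
```

Because the two updates work on fragments that share no attribute, neither needs to see the other's result before the state is recombined, which is the intuition behind intra-object concurrency.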

Following these informal descriptions and some ideas from [18], the formal
description of the object states as well as of the class structure is given
using an OBJ [13] notation.

obj Object-State is
  sort AId .
  subsort OId < Value .
  subsort Attribute < Attributes .
  subsort Id-Attributes < Object .
  subsort Local-Attributes External-Attributes < Id-Attributes .
  protecting Value OId AId .
  op _:_ : AId Value -> Attribute .
  op _,_ : Attribute Attributes -> Attributes [assoc. comm. id: nil] .
  op ⟨_|_⟩ : OId Attributes -> Id-Attributes .
  op _⊕_ : Id-Attributes Id-Attributes -> Id-Attributes [assoc. comm. id: nil] .
  vars Attr : Attribute ; Attrs1, Attrs2 : Attributes ; I : OId .
  eq1 ⟨I|Attrs1⟩ ⊕ ⟨I|Attrs2⟩ = ⟨I|Attrs1, Attrs2⟩ .
  eq2 ⟨I|nil⟩ = I .
endo .

obj Class-Structure is
  protecting Object-State, s-atr_1, ..., s-atr_k, s-arg_l1,1, ..., s-arg_l1,l1,
             ..., s-arg_i1,1, ..., s-arg_i1,i1, ...
  subsort Id.obj < OId .
  subsort Mes_l1, Mes_l2, ..., Mes_ll < Local_Messages .
  subsort Mes_e1, Mes_e2, ..., Mes_ee < Exported_Messages .
  subsort Mes_i1, Mes_i2, ..., Mes_ii < Imported_Messages .
  sort Id.obj, Mes_l1, ..., Mes_ip .
  (* local attributes *)
  op ⟨_|atr_1 : _, ..., atr_k : _⟩ : Id.obj s-atr_1 ... s-atr_k
       -> Local-Attributes .
  (* observed attributes *)
  op ⟨_|at_bs_1 : _, ..., at_bs_k' : _⟩ : Id.obj s-atbs_1 ... s-atbs_k'
       -> External-Attributes .
  (* local messages *)
  op ms_l1 : s-arg_l1,1 ... s-arg_l1,l1 -> Mes_l1 . ...
  (* export messages *)
  op ms_e1 : s-arg_e1,1 ... s-arg_e1,e1 -> Mes_e1 . ...
  (* import messages *)
  op ms_i1 : s-arg_i1,1 ... s-arg_i1,i1 -> Mes_i1 . ...
endo .

¹ attrs_i stands for a simplified form of atr_i1 : val_i1, ..., atr_ik : val_ik.
² In the sense that two messages sent to the same object and acting on
different attributes can be performed (i.e. rewritten) in parallel by
splitting the two parts using this axiom.

Example 1. We present a very simplified Account description, where each
account, from which we can withdraw, into which we can deposit, and whose
interest we can increase, is characterized by: its identifier as a composite
of a number with the bank name, its balance, a minimal limit of the balance,
its interest, the corresponding bank name, and the account holder (identity).
Following the template signature described above, the corresponding account
signature takes the following form, where all data types like nat, money, and
string are assumed to be algebraically specified elsewhere.

obj Account is
  extending Object-State .
  protecting money nat string Id.Bank Id.Holder interest .
  sort Id.Account Account .
  sort OPEN-AC WITHDRW DEPOSIT INTRS .
  subsort Id.Account < OId .
  (* the Account object state declaration *)
  op ⟨_|No : _, bk : _, Hd : _, bal : _, Lmt : _, Ints : _⟩ :
       Id.Account nat Id.Bank Id.Holder money money interest -> Account .
  (* Messages declaration *)
  op OpenAc : Id.Account Id.Bank nat string -> OPEN-AC . (* open a new account *)
  op Wdw : Id.Account money -> WITHDRW .   (* withdraw a given sum *)
  op Dep : Id.Account money -> DEPOSIT .   (* deposit a given sum *)
  op IncI : Id.Account interest -> INTRS . (* increase the interest *)
  vars H : Id.Holder .
  vars C : Id.Account .
  vars W, D, L : money .
  vars I, NI : interest . (* these variables will be used in the associated net *)
endo .



Template and Class Specification. On the basis of the template signature,
we define the notion of template specification as a Co-net and the notion of
class as a marked Co-net. Given a template signature, the associated Co-net
structure can informally be described as follows:

— The places of the Co-net are defined by associating with each message
generator one place, which we call a message place. Hence, each message place
contains message instances of a specific form which are addressed to object
states and not yet performed. In addition to these message places, we
associate with the object sort one object place that has to contain the
current object states of this class. We also note that places associated with
external messages are drawn with bold circles.

— The Co-net transitions reflect the effect of messages on the object states
to which they are addressed. Also, we distinguish between local transitions,
which reflect the evolution of object states, and external ones, which model
the interaction between different classes. Input (resp. output) arcs are
annotated by the input (resp. created) tokens. Both arc inscriptions are
defined as multisets of terms respecting the type of their input and/or
output places; the associated union operation is denoted by $\oplus$.

— Conditions may be associated with transitions. They involve attribute 
and/or message parameter variables. 



Example 2. By applying these translation ideas to the account signature, we
obtain the Co-net depicted in Figure 1. In this net, the four message places
correspond to the four message declarations, and the object place captures
the Account object instances. Four transitions reflecting the behaviour of
these messages are conceived.



Remark 1. It is worth mentioning that in each transition, the input as well
as the output arcs are inscribed just with the relevant part of the invoked
object state(s). For instance, in the DEP(OSIT) transition only the balance
attribute is invoked (i.e. $\langle C \mid bal : B \rangle$ in the input arc
and $\langle C \mid bal : B + D \rangle$ in the output arc). This is the key
idea for a full exhibition of intra- (and inter-) object concurrency. As an
example, the interest increase and the deposit messages may be performed in
parallel for the same account by appropriately splitting its state.



2.2 Co-NETS: Semantical Aspects 

After highlighting how Co-nets templates are constructed, we now focus on the
behavioural aspects of such classes, that is, how to construct a coherent
object society as a community of object states and message instances, and how
such a society evolves correctly. To this end, we present the state evolution
schema to be respected and its corresponding semantics.



[Figure 1: net diagram titled "Behavioural Aspects of the ACCOUNT class"; not reproduced here.]

Fig. 1. The Co-nets Account Specification

Evolution of Object States in Classes. For the evolution of object states in
a given class, we propose a general pattern that has to be respected in order
to ensure the encapsulation property (in the sense that no object states or
messages of other classes participate in this communication) as well as the
preservation of the uniqueness of object identities. Following such
guidelines, and in order to exhibit maximal concurrency, this evolution
schema is depicted in Figure 2, and it can intuitively be explained as
follows: The contact of just the relevant parts of some object states of a
given class Cl, namely
$\langle I_1 \mid attrs_1 \rangle, \ldots, \langle I_k \mid attrs_k \rangle$,
with some messages $ms_{i_1}, \ldots, ms_{i_p}, ms_{j_1}, \ldots, ms_{j_q}$
(declared as local or imported in this class), under some conditions on the
invoked attributes and message parameters, results in the following effects:

— The messages $ms_{i_1}, \ldots, ms_{i_p}, ms_{j_1}, \ldots, ms_{j_q}$ vanish;

— The state change of some (parts of) object states participating in the
communication, namely $I_{s_1}, \ldots, I_{s_t}$. Such change is symbolized
by $attrs'_{s_1}, \ldots, attrs'_{s_t}$ instead of
$attrs_{s_1}, \ldots, attrs_{s_t}$.

— Deletion of some objects by explicitly sending delete messages for such
objects.

— New messages are sent to objects of Cl, namely
$ms'_{h_1}, \ldots, ms'_{h_r}, ms'_{j_1}, \ldots, ms'_{j_q}$.




Fig. 2. The Intra-Component Evolution Pattern 
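Rewriting Rules Governing the Co-nets Behaviour. We propose that each Co-net
transition is captured by an appropriate rewrite rule interpreted in
rewriting logic. Following the intra-component evolution pattern depicted in
Figure 2, the rewrite rule that we associate with it takes the following form:

$$
\begin{aligned}
T:\ & (Ms_{i_1}, ms_{i_1}) \otimes \ldots \otimes (Ms_{i_p}, ms_{i_p}) \otimes (Ms_{j_1}, ms_{j_1}) \otimes \ldots \otimes (Ms_{j_q}, ms_{j_q}) \otimes {} \\
& (obj, \langle I_1 \mid attrs_1 \rangle \oplus \ldots \oplus \langle I_k \mid attrs_k \rangle) \\
\Rightarrow\ & (Ms_{h_1}, ms'_{h_1}) \otimes \ldots \otimes (Ms_{h_r}, ms'_{h_r}) \otimes (Ms_{j_1}, ms'_{j_1}) \otimes \ldots \otimes (Ms_{j_q}, ms'_{j_q}) \otimes {} \\
& (obj, \langle I_{s_1} \mid attrs'_{s_1} \rangle \oplus \ldots \oplus \langle I_{s_t} \mid attrs'_{s_t} \rangle \oplus \langle I_{i_1} \mid attrs_{i_1} \rangle \oplus \ldots \oplus \langle I_{i_r} \mid attrs_{i_r} \rangle) \\
& \text{if } Conditions \text{ and } M(Ad_{Cl}) = \emptyset \text{ and } M(Dl_{Cl}) = \emptyset
\end{aligned}
$$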

Remark 2. The operator $\otimes$ is defined as a multiset union and allows
for relating different place identifiers with their current markings.
Moreover, we assume that $\otimes$ is distributive over $\oplus$, i.e.
$(p, mt_1 \oplus mt_2) = (p, mt_1) \otimes (p, mt_2)$ with $mt_1$, $mt_2$
multisets of terms over $\oplus$ and $p$ a place identifier. The condition
$M(Ad_{Cl}) = \emptyset$ and $M(Dl_{Cl}) = \emptyset$ means that the creation
and the deletion of objects have to be performed first. In other words,
before performing the above rewrite rule, the markings of the $Ad_{Cl}$ as
well as of the $Dl_{Cl}$ places have to be empty. Finally, note that the
selection of just the invoked parts of object states in this evolution
pattern is possible because of the splitting/recombination axiom. This axiom
has to be applied before, and in accordance with, each invoked state
evolution.
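
The distributivity law and the emptiness guard of Remark 2 can be mimicked in a few lines of Python. This is a minimal sketch under assumptions of our own: a marking is a dictionary from place identifiers to `collections.Counter` multisets, terms are plain strings, and the helper names `at_place`, `combine`, and `is_empty` are hypothetical.

```python
from collections import Counter
from typing import Dict

# A marking maps place identifiers to multisets of (string-encoded) terms.
Marking = Dict[str, Counter]


def at_place(p: str, *terms: str) -> Marking:
    """Elementary marking (p, t1 ⊕ ... ⊕ tn)."""
    return {p: Counter(terms)}


def combine(m1: Marking, m2: Marking) -> Marking:
    """The ⊗ operator: multiset union taken place by place."""
    result = {p: Counter(ms) for p, ms in m1.items()}
    for p, ms in m2.items():
        result[p] = result.get(p, Counter()) + ms
    return result


def is_empty(marking: Marking, p: str) -> bool:
    """Check a guard such as M(Ad_Cl) = ∅ for the place named p."""
    return sum(marking.get(p, Counter()).values()) == 0


# (p, mt1 ⊕ mt2) on one side, (p, mt1) ⊗ (p, mt2) on the other.
whole = at_place("DEP", "Dep(c1, 50)", "Dep(c2, 10)")
parts = combine(at_place("DEP", "Dep(c1, 50)"), at_place("DEP", "Dep(c2, 10)"))
```

Comparing `whole` and `parts` for equality checks the law on this one instance; `is_empty(whole, "Ad")` plays the role of the guard on the creation place.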

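Example 3. By applying this general form of rewrite rules, it is not
difficult to generate the rules governing the Account class.

$$
\begin{aligned}
\textbf{OPENC}:\ & (OP\_AC,\ OpenAc(B, H, N)) \otimes (ACNT\_ID,\ ct) \\
\Rightarrow\ & (ACNT,\ \langle B{\cdot}N \mid No : N, Bnk : B, Hd : H, Bal : 0, Lmt : 0, Ints : 0 \rangle) \otimes (ACNT\_ID,\ ct \oplus B{\cdot}N) \quad \text{if } Not(B{\cdot}N \in ct) \\
\textbf{WDR}:\ & (WDR,\ Wdr(C, W)) \otimes (ACNT,\ \langle C \mid Bal : B, Lmt : L \rangle) \\
\Rightarrow\ & (ACNT,\ \langle C \mid Bal : B - W, Lmt : L \rangle) \quad \text{if } (W > 0) \wedge (B - W) > L \\
\textbf{DEP}:\ & (DEP,\ Dep(C, D)) \otimes (ACNT,\ \langle C \mid Bal : B \rangle) \\
\Rightarrow\ & (ACNT,\ \langle C \mid Bal : B + D \rangle) \quad \text{if } (D > 0) \\
\textbf{INTR}:\ & (INTR,\ IncI(C, NI)) \otimes (ACNT,\ \langle C \mid Ints : I \rangle) \\
\Rightarrow\ & (ACNT,\ \langle C \mid Ints : NI \rangle) \quad \text{if } (NI > I)
\end{aligned}
$$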
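
To make the guards of the WDR, DEP and INTR rules concrete, the following Python sketch treats each rule as a function from the invoked part of an account state to its rewritten part, returning `None` when the condition fails. It is an illustration only: the dictionary encoding and the function names `wdr`, `dep`, and `intr` are assumptions, and the sample values are made up.

```python
from typing import Dict, Optional

Attrs = Dict[str, float]


def wdr(part: Attrs, w: float) -> Optional[Attrs]:
    """WDR: uses only Bal and Lmt; enabled if W > 0 and B - W > L."""
    b, lmt = part["Bal"], part["Lmt"]
    if w > 0 and b - w > lmt:
        return {"Bal": b - w, "Lmt": lmt}
    return None  # rule not enabled, the message stays in its place


def dep(part: Attrs, d: float) -> Optional[Attrs]:
    """DEP: uses only Bal; enabled if D > 0."""
    if d > 0:
        return {"Bal": part["Bal"] + d}
    return None


def intr(part: Attrs, ni: float) -> Optional[Attrs]:
    """INTR: uses only Ints; enabled if NI > I."""
    if ni > part["Ints"]:
        return {"Ints": ni}
    return None


# DEP and INTR read disjoint parts of the same (hypothetical) account state.
state: Attrs = {"Bal": 100, "Lmt": 0, "Ints": 2}
after_dep = dep({"Bal": state["Bal"]}, 50)
after_intr = intr({"Ints": state["Ints"]}, 3)
new_state = {**state, **(after_dep or {}), **(after_intr or {})}
```

Each function receives only the attributes its rule mentions, mirroring the way arc inscriptions carry just the relevant part of the object state.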

2.3 Co-NETS: More Advanced Constructions 

So far, we have only presented how the Co-nets approach allows independent
classes to be conceived. In what follows, we show how more complex systems
can be constructed using advanced abstraction mechanisms, especially
inheritance and interaction between classes. However, due to the focus of
this paper, only general ideas about simple inheritance and interaction are
given.
Simple Inheritance. Given a (super)class Cl modeled as a Co-net, for
constructing a subclass that inherits the structure as well as the behaviour
of the superclass Cl and exhibits new behaviour involving additional
attributes, we propose the following straightforward conceptualization:

— Define the structure of the new subclass by introducing the new attributes
and messages. Structurally, the new attribute identifiers and message
generators are described using the extending primitive in the OBJ notation.

— As the object place for the subclass, we use the same object place as for
the superclass. This means that this place should now contain the object
states of the superclass as well as the object states of the subclass. This
is semantically sound because the sort of this object place is a supersort
for objects including more attributes.

— As previously described, the proper behaviour of the subclass is
constructed by associating with each new message a corresponding place and
constructing its behaviour (i.e. transitions) with respect to the
communication model of Figure 3, under the condition that at least one of the
additional attributes is involved in such transitions.

Remark 3. Such a conceptualization is only possible because of the
splitting/recombination axiom. Indeed, this axiom permits an object state of
a subclass, denoted, for instance, as
$\langle Id \mid attrs, attrs' \rangle$ with $attrs'$ the additional
attributes (i.e. those proper to the subclass), to be considered also as an
object state of the superclass (i.e. $\langle Id \mid attrs \rangle$).
Obviously, this allows a systematic inheritance of the structure as well as
the behaviour. Dynamic binding with polymorphism is systematically taken into
account in this modeling. Indeed, when a message is sent to a hierarchy of
classes, we can only know after the firing of the associated transition to
which class in the hierarchy the object concerned belongs.
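
The argument of Remark 3 can be sketched in Python as follows. The sketch assumes a reduced set of Account attributes and a hypothetical subclass attribute `Term`; the function names `as_superclass` and `deposit` are likewise illustrative and not taken from the Co-nets notation.

```python
from typing import Dict, FrozenSet, Tuple

State = Tuple[str, Dict[str, float]]

# Attribute identifiers of the superclass (a reduced Account, for brevity).
ACCOUNT_ATTRS: FrozenSet[str] = frozenset({"bal", "Lmt", "Ints"})


def as_superclass(state: State, super_attrs: FrozenSet[str]) -> Tuple[State, State]:
    """Split <Id|attrs, attrs'> into the inherited part and the proper part."""
    ident, attrs = state
    inherited = {k: v for k, v in attrs.items() if k in super_attrs}
    proper = {k: v for k, v in attrs.items() if k not in super_attrs}
    return (ident, inherited), (ident, proper)


def deposit(state: State, amount: float) -> State:
    """A transition written once, for the superclass only."""
    ident, attrs = state
    return ident, {**attrs, "bal": attrs["bal"] + amount}


# Hypothetical subclass object with one additional attribute, "Term".
savings: State = ("sav7", {"bal": 200, "Lmt": 0, "Ints": 2, "Term": 12})
inherited, proper = as_superclass(savings, ACCOUNT_ATTRS)
inherited = deposit(inherited, 25)  # inherited behaviour applies unchanged
updated: State = (savings[0], {**inherited[1], **proper[1]})
```

The superclass transition never sees the additional attribute, so no adaptation of inherited behaviour is needed when a subclass adds attributes.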

Interaction between classes. To conceive such interaction, on the one hand,
we have to take into account the fact that the evolution of object states
within classes is ensured by the intra-component pattern depicted in Figure
2. On the other hand, we have to respect the encapsulation property, which
stipulates that internal features of objects have to be hidden from the
outside. Following these guidelines, the inter-component interaction we
propose is depicted in Figure 3. It may be made explicit as follows: The
contact of some external parts of some object states, namely
$\langle I_1 \mid attrs\_ob_1 \rangle, \ldots, \langle I_k \mid attrs\_ob_k \rangle$,
which may belong to different classes $C_1, \ldots, C_m$, with some external
messages $ms_{i_1}, \ldots, ms_{i_p}, ms_{j_1}, \ldots, ms_{j_q}$ defined in
such classes, and under some conditions on attribute values and message
parameters, results in the following:

— The messages $ms_{i_1}, \ldots, ms_{i_p}, ms_{j_1}, \ldots, ms_{j_q}$ vanish.

— The state change of some (external parts of) object states participating
in the communication, namely $I_{s_1}, \ldots, I_{s_t}$. Such a change is
symbolized by $attrs'_{s_1}, \ldots, attrs'_{s_t}$ instead of
$attrs_{s_1}, \ldots, attrs_{s_t}$. The other participating parts of object
states remain unchanged (i.e. deletion or creation of parts of states is not
allowed).

— New external messages (that may involve deletion/creation ones) are sent to
objects of different classes, namely
$ms'_{h_1}, \ldots, ms'_{h_r}, ms'_{j_1}, \ldots, ms'_{j_q}$.




Fig. 3. The Inter-Component Interaction pattern. 



3 Co-nets Object Evolution 

As pointed out in the introduction, for dealing with runtime behaviour
modification we propose to introduce new syntactical constructions, namely
meta-places and meta-transitions, and to endow them with a sound semantics
expressed by a new inference rule. To achieve this goal, we first introduce
the main characteristics of these syntactical constructions. Second, we
present the semantic counterpart of these constructions. Third, we illustrate
this new way of handling behaviour modification using a running example.

3.1 Meta-Places and Transitions in Co-nets 

For capturing the dynamic modification, we propose to associate a meta-object
level³ with each component which is subject to future modification. As
depicted in Figure 4, the meta-object level associated with a given component
is composed of a meta-place and three transitions with corresponding messages
for adding, removing, or updating existing behaviour, respectively. Moreover,
at the object level there are now two forms of transitions: usual transitions
capturing the rigid behaviour (respecting the intra-component evolution
pattern described above), as represented on the left-hand side, and
non-instantiated transitions, or meta-transitions, that are directly related
to the meta-place through an appropriate read-only⁴ arc.

³ Although it is also possible to associate another meta-object level for
dealing with modification during the cooperation of such components, for the
sake of understandability of this first step towards runtime modification in
Co-nets we restrict ourselves to the level of components.

⁴ Note that a read-only arc, usually represented by arrows, carries only an
enabling condition reflecting the presence of such tokens in the
corresponding place (i.e. there is no deletion or creation of tokens).
[Figure 4: net diagram titled "The Meta-object Level Governing the Modified Behaviour"; not reproduced here. Recoverable labels: a meta-place holding tokens of the form ⟨T : i | IC, CT, TC⟩; three meta-level transitions fired by the messages Add_Bh(T, (P_1, IC_1) ⊗ .. ⊗ (P_k, IC_k), (Q_1, CT_1) ⊗ .. ⊗ (Q_k, CT_k), TC), Del_Bh(T, i), and Chg_Bh(T, i, ...); for Add_Bh, an output labelled "new-version" that stores ⟨T : i + 1 | IC', CT', TC'⟩ and an output labelled "new-behaviour" that stores ⟨T : 1 | IC', CT', TC'⟩, guarded by two conditions cond1 and cond2 (cond2 reads T = T1); and a read-only arc from the meta-place to the non-instantiated transition at the object level.]

Fig. 4. The Meta General Forms



In contrast to the intra-component evolution pattern for usual transitions,
all tokens annotating input as well as output arcs in this meta-component
evolution pattern (as depicted on the right-hand side) are just specific
variables. In other words, all arc inscriptions, namely
$IC_{obj}, CT_{obj}, IC_{i_1}, \ldots, IC_{j_q}, CT_{i_1}, \ldots, CT_{j_q}$,
should be declared as variables with sorts compatible with their
corresponding input/output places. The meta-place is the main component in
this meta-construction. It has to contain, as tokens, the different
behaviours of meta-transitions. Following the form depicted in Figure 4, each
token in the meta-place has the following form:

$\langle T : i \mid (obj, IC_{obj}) \otimes (Ms_{i_1}, IC_{i_1}) \otimes \ldots \otimes (Ms_{j_q}, IC_{j_q}),\ (obj, CT_{obj}) \otimes (Ms_{i_1}, CT_{i_1}) \otimes \ldots \otimes (Ms_{j_q}, CT_{j_q}),\ TC \rangle$

where:

— $T$ stands for the name or the label of the transition, which has to exist
at the object level as a non-instantiated transition; $i$ stands for a
particular version of the behaviour associated with this transition $T$. This
allows, in particular, more than one behaviour to be associated with a given
transition, and thereby a trace to be kept of the successive changes and of
the strategy behind them.

— The component
$(obj, IC_{obj}) \otimes (Ms_{i_1}, IC_{i_1}) \otimes \ldots \otimes (Ms_{j_q}, IC_{j_q})$
defines the different input places with their corresponding arc inscriptions
(multisets of terms). In contrast to the terms in object places, the terms
above are not ground terms; rather, they contain variables exactly like the
inscriptions associated with usual transitions.

— The second component
$(obj, CT_{obj}) \otimes (Ms_{i_1}, CT_{i_1}) \otimes \ldots \otimes (Ms_{j_q}, CT_{j_q})$
captures the different output places and the associated arc inscriptions.

— Finally, the component $TC$ reflects the condition that may be associated
with a given transition.
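
The four components of a meta-token can be written down as a small Python record. This is a sketch under stated assumptions: arc inscriptions and the condition are kept as uninterpreted strings, the class name `MetaToken` and its field names are hypothetical, and the sample token is modelled on the deposit inscriptions of Remark 1.

```python
from dataclasses import dataclass
from typing import Tuple

# One arc inscription: (place identifier, term pattern kept as a string).
Inscription = Tuple[str, str]


@dataclass(frozen=True)
class MetaToken:
    transition: str                    # T, label of a non-instantiated transition
    version: int                       # i, version of the behaviour
    inputs: Tuple[Inscription, ...]    # (obj, IC_obj) ⊗ (Ms, IC) ⊗ ...
    outputs: Tuple[Inscription, ...]   # (obj, CT_obj) ⊗ (Ms, CT) ⊗ ...
    condition: str                     # TC, kept symbolic

    @property
    def key(self) -> Tuple[str, int]:
        """(T, i) identifies one behaviour version of one transition."""
        return self.transition, self.version


# Hypothetical first version of a deposit behaviour.
dep_v1 = MetaToken(
    transition="DEP",
    version=1,
    inputs=(("obj", "<C|bal : B>"), ("DEP", "Dep(C, D)")),
    outputs=(("obj", "<C|bal : B + D>"),),
    condition="D > 0",
)
```

Making the record immutable keeps a stored version unchanged once it sits in the meta-place; a modification then replaces the token rather than editing it in place.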

To define this form of meta-tokens precisely using an OBJ notation, we first
present a corresponding OBJ notation for the notion of Co-nets state, so far
used more informally. The two sort constraints ensure that the sort of the
marking (terms) is compatible with its corresponding place.

obj Co-NETS-State is
  extending class-description .
  subsort obj Mes_l1 ... Mes_ee < Place .
  subsort Place < Place_Marking .
  sort Place_Identifier .
  op obj Mes_l1 ... Mes_ee : -> Place_Identifier .
  op _⊕_ : Place Place_Marking -> Place_Marking [assoc, comm, id: ∅] .
  op (_,_) : Place_Identifier Place_Marking -> Marking .
  op _⊗_ : Marking Marking -> Marking .
  vars p : Place_Identifier .
  vars T, T1, T2 : Place_Marking .
  scr (p,T) : Undefined if p = obj ∧ T : {Ms_l1, .., Ms_ee} .
  scr (p,T) : Undefined if p ∈ {Ms_l1, .., Ms_ee} ∧ T : obj .
  eq (p, T1 ⊕ T2) = (p, T1) ⊗ (p, T2) .
endo .

obj meta-CO-NETS-State is
  extending Co-NETS-State .
  sort Meta-place_Id Transition_Id Meta-Marking .
  subsort Input-Tokens Created-Tokens < Component .
  subsort Meta-Token < Meta-Tokens .
  subsort Condition < Boolean .
  op [_] : Marking -> Component .
  op <_:_|_,_,_> : Transition_Id Nat Input-Tokens Created-Tokens
                   Condition -> Meta-Token .
  op __ : Meta-Token Meta-Tokens -> Meta-Tokens [assoc, comm, id:] .
  op (_,_) : Meta-place_Id Meta-Tokens -> Meta-Marking .
  op _⊗_ : Meta-Marking Meta-Marking -> Meta-Marking .
  vars T1, T2 : Meta-Tokens .
  vars p : Meta-place_Id .
  eq (p, T1 T2) = (p,T1) ⊗ (p,T2) .
endo .

Finally, the three transitions at the meta-level, namely Add-beh, Del-beh, and
Chg-beh, make it possible to introduce an additional behaviour (i.e., a new
version of an already existing transition behaviour), to delete an existing
behaviour, or to modify an existing behaviour (i.e., replace the input tokens
and/or the output tokens and/or the transition condition). The associated
rewrite rules for these three transitions can be generated straightforwardly,
in the same way as at the object level.



3.2 The Co-NETS-meta level semantics 

To propagate a given behaviour from the meta-object level to the object level
using non-instantiated transitions, we have to fully exploit the read-only arc
relating the two levels. To this end, we propose a two-step approach. First,
we instantiate the selected transition with the appropriate selected behaviour
(given as a token from the meta-place). Second, we use this instantiated
rewrite rule as a usual rewrite rule. Hence, the crucial step is how to achieve
the instantiation in such a way that we generate nothing but a usual rewrite
rule that respects the intra-component evolution pattern depicted in Figure 2.

Before going into the technical details and their complex notation, let us
explain the main ideas of this process of (meta-rule) instantiation, and the
meta-inference rule that we associate with it, in a more abstract and
simplified way. Due to the presence of the read-only arc relating the two
levels, the rewrite rule that may be associated with a non-instantiated (or
meta-) transition can be abstracted as $T : l \otimes m \Rightarrow r \otimes m$
(for the sake of simplicity we omit the condition part). The terms $l$ and $r$
represent the input tokens (as variables) with their associated places and the
output tokens (as variables) with their places. The term $m$ stands for the
general form of the tokens that can be selected from the meta-place. Note that,
to reflect the read-arc semantics appropriately, $m$ must appear unchanged on
the left-hand side as well as on the right-hand side. Now assume that the place
meta-place contains (at least) one meta-token of the form
$\langle T : i \mid \dots \rangle$ representing one behaviour of the transition
$T$. This selected meta-token can be denoted by $\sigma(m)$, where $\sigma$
denotes the substitution of the $IC$ and $CT$ variables in $m$ by their
corresponding terms selected from this particular meta-token.

Altogether, we now have the following two-component premise of the inference
rule to be proposed: $\sigma(m)$ and $T : l \otimes m \Rightarrow r \otimes m$.
As conclusion from this premise, we propose the following (instantiated)
rewrite rule: $T : \sigma(l) \Rightarrow \sigma(r)$. Thus, our abstract
meta-inference rule has the form:

$$\frac{\sigma(m) \qquad l \otimes m \Rightarrow r \otimes m}{\sigma(l) \Rightarrow \sigma(r)} \qquad (**)$$

After introducing this abstract view, it is not difficult to be more concrete
and replace the abstract terms $l$, $r$, and $m$ by their concrete
instantiation. Following the general schema represented in Figure 4, the
(multisets of) terms associated with $l$, $r$, and $m$ are as follows:

— $l$ stands for the input tokens in the meta-transition $T(i)$ and hence has
to be instantiated by:

$(obj, IC_{obj}) \otimes (Ms_{i_1}, IC_{i_1}) \otimes \dots \otimes (Ms_{i_p}, IC_{i_p})$

— $r$ stands for the output tokens in the meta-transition $T(i)$ and hence has
to be instantiated by:

$(obj, CT_{obj}) \otimes (Ms_{j_1}, CT_{j_1}) \otimes \dots \otimes (Ms_{j_q}, CT_{j_q})$

— Also we have to add the condition part as TC(t).

— $m$ stands for the term associated with the read-arc, denoted as
selected-meta-Token, and hence corresponds to:

$\langle T : i \mid (obj, IC_{obj}) \otimes (Ms_{i_1}, IC_{i_1}) \otimes \dots \otimes (Ms_{i_p}, IC_{i_p}),\ (obj, CT_{obj}) \otimes (Ms_{j_1}, CT_{j_1}) \otimes \dots \otimes (Ms_{j_q}, CT_{j_q}),\ TC \rangle$

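— $\sigma$ stands for any substitution of the $IC$ and $CT$ variables and of
$TC$ by concrete terms of a given selected token from the meta-place. For
that, we assume that the meta-place contains (at least) the following token:

$\langle T : i \mid (obj, \langle I_1 | attrs_{i_1} \rangle \oplus \dots \oplus \langle I_k | attrs_{i_k} \rangle) \otimes (Mes_{i_1}, mes_{i_1}) \otimes \dots,\ (obj, \langle J_1 | attrs'_{j_1} \rangle \oplus \dots \oplus \langle J_m | attrs'_{j_m} \rangle) \otimes \dots \otimes (Mes'_{j_q}, mes'_{j_q}),\ Condition \rangle$

In this case, the substitution $\sigma$ should be defined as the union of the
following elementary (variable/term) replacements:

Variable        Corresponding substitution
$IC_{obj}$      $\langle I_1 | attrs_{i_1} \rangle \oplus \dots \oplus \langle I_k | attrs_{i_k} \rangle$
$IC_{i_l}$      $mes_{i_l}$, for each input message place $Mes_{i_l}$
$CT_{obj}$      $\langle J_1 | attrs'_{j_1} \rangle \oplus \dots \oplus \langle J_m | attrs'_{j_m} \rangle$
$CT_{j_l}$      $mes'_{j_l}$, for each output message place $Mes'_{j_l}$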

These tedious (abstract) instantiations become simpler when applied to
concrete examples.

3.3 The Account Evolving Specification 

The purpose of this subsection is to give a concrete illustration of the
proposed ideas for coping with dynamic evolution in distributed information
systems specified as Co-NETS components. Using the Account specification given
earlier, we now assume that only the Deposit operation is rigid. This means
that Withdraw as well as Interest-Increase are conceived (initially) to be
subject to change during the life cycle of an account.

Following our conceptualization, while the Deposit behaviour (i.e., its
corresponding transition) remains unchanged as a usual transition, the
transitions associated with Withdraw and Interest-Increase have to be conceived
as non-instantiated transitions. This fact is reflected in Figure 5, where a
new meta-object level is introduced. The meta-place meta-account is directly
related to the two non-instantiated transitions using read-only arcs. The terms
associated with these two arcs are

$\langle WDR : i \mid (ACNT, ICw1) \otimes (WDR, ICw2),\ (ACNT, CTw),\ TCw \rangle$

for Withdraw and

$\langle INTR : i \mid (ACNT, ICi1) \otimes (INTR, ICi2),\ (ACNT, CTi),\ TCi \rangle$

for Interest-Increase. The non-instantiated transitions have been derived from
the two usual transitions associated with Withdraw and Interest-Increase in two
phases. First, we have replaced the instantiated inscriptions of the input and
output arcs of these transitions, together with their conditions, by
corresponding variables, namely ICw1, ICw2, CTw, and TCw for the WDR transition
and ICi1, ICi2, CTi, and TCi for the INTR transition. Second, the corresponding
initial behaviour of these two operations is recorded as tokens in the
meta-place meta-account. More precisely, the token reflecting the behaviour of
Withdraw is:

$\langle WDR : 1 \mid (ACNT, \langle C | Bal : B, Lmt : L \rangle) \otimes (WDR, Wdr(C, W)),\ (ACNT, \langle C | Bal : B - W, Lmt : L \rangle),\ (W > 0) \wedge ((B - W) > L) \rangle$

And the token reflecting the behaviour of Interest-Increase is:

$\langle INTR : 1 \mid (ACNT, \langle C | Ints : I \rangle) \otimes (INTR, IncI(C, NI)),\ (ACNT, \langle C | Ints : NI \rangle),\ (NI > I) \rangle$

Besides this behaviour, we now assume that the managers of this application
decide to add a new version of Withdraw and to completely change the current
behaviour of Interest-Increase. For the supplementary Withdraw behaviour they
propose, for instance, that only a sum that is less than 2 percent of the
current balance may be withdrawn, and that some tax (a natural constant tax)
is deducted from the balance in addition to the withdrawn sum. In the modified
form of Interest-Increase, they require that only accounts with a balance of
more than 3000 are authorized to increase their interest, and by no more than
1 percent.

This runtime introduction of a new Withdraw behaviour, as well as the
modification of the Interest-Increase method, is reflected in the corresponding
places Add-Beh and Chg-Beh in Figure 5 by the presence of the following terms:
Fig. 5. A Dynamically Evolving Account Specification (figure not reproduced)



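$Add\_Beh(WDR,\ (ACNT, \langle C | Bal : B, Lmt : L \rangle) \otimes (WDR, Wdr(C, W)),\ (ACNT, \langle C | Bal : B - tax - W, Lmt : L \rangle),\ (W > 0) \wedge ((B - W) > L) \wedge (W < 0.02 * B))$

$Chg\_Beh(INTR,\ (ACNT, \langle C | Ints : I, Bal : B \rangle) \otimes (INTR, IncI(C, NI)),\ (ACNT, \langle C | Ints : NI, Bal : B \rangle),\ (NI > I) \wedge (NI < 0.01) \wedge (B > 3000))$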



The firing of the meta-transitions ADD and MODIF results in the Co-net
depicted in Figure 6, where the new version of the Withdraw behaviour is added
and the Interest-Increase behaviour is modified.








N. Aoumeur 



Let us, for instance, explain how the meta-inference rule is applied to
generate this new (additional) behaviour for Withdraw.^ First, we have to apply
the (abstract) meta-inference rule (**) with the following replacements:

— $l$ stands for the terms (as variables), with their respective places,
annotating the input arcs of the meta-transition $WDR(i)$; hence it is equal to
$(ACNT, ICw1) \otimes (WDR, ICw2)$;

— $r$ stands for the terms (as variables), with their respective places,
annotating the output arcs of the meta-transition $WDR(i)$; hence it is equal
to $(ACNT, CTw)$;

— $m$ stands for the read-arc relating this transition to the meta-place
meta-account, i.e., it is equal to
$\langle WDR : i \mid (ACNT, ICw1) \otimes (WDR, ICw2),\ (ACNT, CTw),\ TCw \rangle$;

— the condition $C$ stands for $TCw$;

— the substitution $\sigma$ is the union of the following elementary
substitutions:



Variable    Corresponding substitution
ICw1        $\langle C | Bal : B, Lmt : L \rangle$
ICw2        $Wdr(C, W)$
CTw         $\langle C | Bal : B - tax - W, Lmt : L \rangle$
TCw         $(W > 0) \wedge ((B - W) > L) \wedge (W < 0.02 * B)$



— The conclusion corresponds to the following (usual) Co-nets rewrite rule,
which can be applied in exactly the same way as the rules of the object level:

$WDR(2) : (ACNT, \langle C | Bal : B, Lmt : L \rangle) \otimes (WDR, Wdr(C, W)) \Rightarrow (ACNT, \langle C | Bal : B - tax - W, Lmt : L \rangle)$
if $(W > 0) \wedge ((B - W) > L) \wedge (W < 0.02 * B)$
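
The instantiation step above can be sketched in executable form. The following
Python fragment is only an illustration under stated assumptions, not the
Co-nets implementation: tokens are modelled as plain dictionaries, input terms
as field-to-variable patterns, output terms and conditions as functions over
the variable bindings, and the names `make_sigma`, `instantiate`, `fire`, and
the value `TAX = 1` are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Bindings = Dict[str, Any]
TAX = 1  # hypothetical value of the constant `tax`

# Non-instantiated transition WDR: its arcs carry variables only.
L_ARCS: List[Tuple[str, str]] = [("ACNT", "ICw1"), ("WDR", "ICw2")]  # l
R_ARCS: List[Tuple[str, str]] = [("ACNT", "CTw")]                    # r
COND_VAR = "TCw"

# Second Withdraw behaviour, as a meta-token stored in the meta-place.
WDR_V2 = {
    "transition": "WDR",
    "version": 2,
    # Input terms: token field -> variable name.
    "inputs": [{"id": "C", "Bal": "B", "Lmt": "L"},   # <C | Bal : B, Lmt : L>
               {"id": "C", "amount": "W"}],           # Wdr(C, W)
    # Output term: builds <C | Bal : B - tax - W, Lmt : L> from the bindings.
    "outputs": [lambda b: {"id": b["C"], "Bal": b["B"] - TAX - b["W"],
                           "Lmt": b["L"]}],
    "condition": lambda b: (b["W"] > 0 and b["B"] - b["W"] > b["L"]
                            and b["W"] < 0.02 * b["B"]),
}

def make_sigma(token: dict) -> Dict[str, Any]:
    """Build sigma by pairing the variables of m with the terms of the token."""
    sigma: Dict[str, Any] = {}
    for (_, var), term in zip(L_ARCS, token["inputs"]):
        sigma[var] = term
    for (_, var), term in zip(R_ARCS, token["outputs"]):
        sigma[var] = term
    sigma[COND_VAR] = token["condition"]
    return sigma

def instantiate(token: dict) -> dict:
    """Conclusion of the meta-inference rule: sigma(l) => sigma(r) if sigma(TC)."""
    sigma = make_sigma(token)
    return {
        "label": f'{token["transition"]}({token["version"]})',
        "lhs": [(place, sigma[var]) for place, var in L_ARCS],
        "rhs": [(place, sigma[var]) for place, var in R_ARCS],
        "cond": sigma[COND_VAR],
    }

def fire(rule: dict, marking: Dict[str, dict]) -> Optional[Dict[str, dict]]:
    """Apply an instantiated rule to one token per input place, if it matches."""
    bindings: Bindings = {}
    for place, pattern in rule["lhs"]:
        for field, var in pattern.items():
            value = marking[place][field]
            if var in bindings and bindings[var] != value:
                return None          # e.g. the message addresses another account
            bindings[var] = value
    if not rule["cond"](bindings):
        return None                  # the transition condition does not hold
    return {place: build(bindings) for place, build in rule["rhs"]}
```

With an account holding a balance of 1000 and a limit of 0, a withdrawal of 10
passes all three conditions, whereas a withdrawal of 50 violates the 2 percent
bound and leaves the rule inapplicable.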



4 Conclusion 

The Co-nets approach is an adequate conceptual model that conceives
information systems as fully distributed, autonomous yet cooperative
components. It is based on a complete integration of OO concepts and
constructions into a variant of algebraic Petri nets, and it has the following
features. First, Co-nets allow for modeling components as hierarchies of
classes through different forms of inheritance, object composition, and
aggregation. Second, interaction between such components is achieved using
explicit interfaces. Third, the Co-nets semantics is expressed in rewriting
logic, allowing rapid prototyping using rewriting techniques.

In this paper we have proposed an appropriate extension of the approach for
dynamically evolving different system components in a formal way. The main
ideas behind this Co-NETS extension are the use of new forms of places and
transitions, named respectively meta-places and meta-transitions, and the
introduction of an adequate inference rule that propagates the behaviour to
usual Co-net rewrite rules. This new form of evolving specification has been
illustrated using a simplified account specification.

Fig. 6. The Meta-Account Objects after firing the meta-level transitions
(figure not reproduced)

° Note that the choice of a given withdraw method is left completely to the user.
^ That is to say, we now have two Withdraw behaviours.

However, after this crucial first step towards specifying and rapid-prototyping
advanced information systems as distributed, dynamically evolving, autonomous
yet cooperating components using our Co-nets approach, we are conscious that
much work remains ahead. Our future investigation will focus on the
consolidation of this dynamic evolution by drawing up more complex case
studies. Furthermore, as a result of this formalization of dynamic evolution,
two interesting directions are to be investigated. On the one hand, we plan to
extend the meta-place content to deal not only with behaviour but also with
object and message instances. This will, in particular, allow us to include and
enforce dynamic as well as static integrity constraints (on the whole component
instances). On the other hand, we also plan to include in these meta-places
proof-algebras [7], which keep track of all fired transitions in the object
level using an adequate partial ordering. This will permit us to incorporate
past temporal reasoning as a formal verification technique in our approach.

Abstract. In this article we extend previous work on the development of
logical foundations for the specification of the dynamics of databases. In
particular, we deal with two problems: first, the derivation of active rules
that maintain the consistency of the database by triggering repairing actions;
second, the correct integration of the specification of the derived rules into
the original specification of the database dynamics. We show that the expected
results are achieved. For instance, the derived axiomatization includes, at the
object level, the specification that repairing action executions must be
enforced whenever necessary.



1 Introduction 

In this article we propose a logic based approach to automatically derive active 
rules [30] for the maintenance of the integrity of a database [5]. This research 
follows the tradition of specifying the semantics of a database using mathematical 
logic [20,7]. In particular, we deal with a logical framework in which transactions 
are treated as objects within the logical language, allowing one to reason about 
the dynamics of change in a database as transactions are executed. 

As shown in [26], it is possible to specify the dynamics of a relational
database with a special formalism written in the situation calculus (SC) [17],
a language of many-sorted predicate logic for representing knowledge and
reasoning about actions and change. Apart from providing a natural and
well-studied semantics, the formalism can be used to solve different reasoning
tasks; for instance, reasoning about the evolution of a database [3], about the
hypothetical evolutions of a database [1], about the dynamics of views [2], etc.

In the SC formalism, each relational table is represented by a table predicate. 
Each table predicate has one situation argument which is used to denote the 
state of the database. In order to specify the dynamics of the relations in a 
database, one derives the so-called successor state axioms (SSAs); one SSA per 
base relation or table. 

* This research has been partially financed by FONDECYT (Grants 1990089 and
1980945), and ECOS/CONICYT (Grant C97F05).

Each SSA describes the exact conditions under which the presence of an
arbitrary tuple in the table holds after executing an arbitrary legal primitive
transaction^1. SSAs state necessary and sufficient conditions for any tuple to
belong to a relation after a transaction is performed. These conditions refer
only to the state^2 in which the transaction is executed and do not make
reference to further constraints on the resulting state. Thus, there are no
explicit integrity constraints (ICs) in the specification.

Given the exhaustive descriptions of the successor states (those obtained after
executing primitive transactions) provided by the SSAs, it is very easy to run
into inconsistencies if integrity constraints are introduced in the
specification. For example, this is the case when ramification constraints are
introduced. These are constraints that force the database to make indirect
changes in tables due to changes in other tables.

There are several options to deal with this problem: 

1. One can assume that the ICs are somehow embedded in the SSAs. This is the 
approach in [26]. They should be logical consequences of the DB specification. 

2. Some ICs can be considered as qualification constraints, that is, as
constraints on the executability of the database actions. In [14] a
methodology is presented for translating these constraints into axioms on the
executability of actions or, better, on the legality of actions. In this
approach, the so-called Action Precondition Axioms are generated.

3. For some interesting syntactical classes of ramification ICs, there are
mechanisms for compiling them into Effect Axioms, from which the SSAs can be
re-computed. Then the explicit ICs disappear and turn out to be logical
consequences of the new specification [18,23] (see also [3] for implementation
issues).

4. It is also possible to think of a database maintenance approach, consisting 
of adding active rules to a modification of the original specification. These 
rules enforce the satisfaction of the ICs by triggering appropriate auxiliary 
actions. Preliminary work on this, in a general framework for knowledge 
representation of action and change, is shown in [22]. 

The last alternative is the subject of this paper. There are several issues to
be considered. First, a computational mechanism should be provided for deriving
active rules and repairing actions from the ICs. Second, the active rules
should be consistent with the rest of the specification and produce the
expected effects. Third, since the active rules will have the usual
Event-Condition-Action (ECA) form [30,31], which does not have a direct
predicate logic semantics, they should be specifiable in (a suitable extension
of) the language of the situation calculus, and integrated smoothly with the
rest of the specification. Some work in this direction, on the assumption that
general ECA rules are given, is presented in [4].

^1 These are the simplest, non-decomposable transactions; they are domain
dependent. In the KR literature they are called “actions”. We consider the
notions of primitive transaction and action as synonyms.

^2 In this paper we do not make any distinction between situations and states.

2 Specification of the Database Dynamics

Characteristic ingredients of a particular language $\mathcal{L}$ of the
situation calculus, besides the usual symbols of predicate logic, are:

(a) Among others, the sorts action and situation.

(b) Predicate symbols whose last argument is of the sort situation. These
predicates depend on the state of the world and can be thought of as the
tables in a relational database.

(c) Operation symbols which, applied to individuals, produce actions (or
primitive transactions); for example, $enroll(\cdot)$ may be an operation, and
$enroll(john)$ becomes an action term.

(d) A constant, $S_0$, to denote the initial state of the database.

(e) An operation symbol $do$ that takes an action and a situation as arguments,
producing a new situation, a successor situation resulting from the execution
of the action at the given situation.

In these languages there are first-order variables for individuals of each
sort, so it is possible to quantify over individuals, actions, and situations.
The corresponding quantifiers are usually written $\forall x$, $\forall a$,
$\forall s$, respectively.

The specification of a dynamically changing world, by means of an appropriate
language of the situation calculus, consists of a specification of the laws of
evolution of the world. This is typically done by specifying:

1. Fixed, state-independent, but domain-dependent knowledge about the
individuals of the world.

2. Knowledge about the state of the world at the initial situation, given in
terms of formulas that do not mention any state besides $S_0$.

3. Preconditions for performing the different actions (or making their
execution possible). The predicate $Poss$ is introduced in $\mathcal{L}$. It
has one action and one situation as arguments. Thus, $Poss(a, s)$ says that
the execution of action $a$ is possible in state $s$.

4. The immediate (positive or negative) effects of actions in terms of the
tables whose truth values we know are changed by their execution.

In Reiter’s formalism, the knowledge contained in items 1 and 2 above is
considered the initial database $\Sigma_0$. The information given in item 3 is
formalized by means of action precondition axioms (APAs) of the form:

$Poss(A(\bar{x}), s) \equiv \pi_A(\bar{x}, s)$,

for each action name $A$, where $\pi_A(\bar{x}, s)$ is an SC formula that is
simple in $s$. A formula is said to be simple in a situation term $s$ if it
contains no state term other than $s$ (e.g., no $do$ symbol), no
quantification over states, and no occurrences of the $Poss$ predicate [15].
Finally, item 4 is expressed by effect axioms for pairs (primitive transaction,
table):

Positive Effects Axioms: For some pairs formed by a table $R$ and an action
name $A$, an axiom of the form:

$\forall (\bar{x}, \bar{y}, s)\, [Poss(A(\bar{y}), s) \wedge \varphi^{+}_{R}(\bar{y}, \bar{x}, s) \supset R(\bar{x}, do(A(\bar{y}), s))]$.   (1)

Intuitively, if the named primitive transaction $A$ is possible, and the
preconditions on the database (in particular, on the table $R$), represented by
the meta-formula $\varphi^{+}_{R}(\bar{y}, \bar{x}, s)$, are true at state $s$,
then the statement $R$ becomes true of $\bar{x}$ at the successor state
$do(A(\bar{y}), s)$ obtained after execution of $A$ at state $s$. Here,
$\bar{x}$, $\bar{y}$ are parameters for the table and the action. Notice that
in general we have two kinds of conditions: (a) preconditions for action
executions, independent of any table they might affect, which are axiomatized
by the $Poss$ predicate; and (b) preconditions on the database for pairs
table/action which make the changes possible (given that the action is already
possible). The latter are represented by $\varphi^{+}_{R}(\bar{y}, \bar{x}, s)$.

Negative Effects Axioms: For some pairs formed by a table $R$ and an action
name $A$, an axiom of the form:

$\forall (\bar{x}, \bar{y}, s)\, [Poss(A(\bar{y}), s) \wedge \varphi^{-}_{R}(\bar{y}, \bar{x}, s) \supset \neg R(\bar{x}, do(A(\bar{y}), s))]$.   (2)

This is the case where action $A$ makes table $R$ become false of $\bar{x}$ in
the successor state.

Example 1. Consider an educational database as in [26], with the following
ingredients.

Tables:
1. $Enrolled(stu, c, s)$: student $stu$ is enrolled in course $c$ in the state $s$.
2. $Grade(stu, c, g, s)$: the grade of student $stu$ in course $c$ is $g$ in the state $s$.

Primitive Transactions:
1. $register(stu, c)$: register student $stu$ in course $c$.
2. $change(stu, c, g)$: change the grade of student $stu$ in course $c$ to $g$.
3. $drop(stu, c)$: eliminate student $stu$ from the course $c$.

Action Precondition Axioms:

$\forall (stu, c, s)\, [Poss(register(stu, c), s) \equiv \neg Enrolled(stu, c, s)]$.
$\forall (stu, c, g, s)\, [Poss(change(stu, c, g), s) \equiv \exists g'\, Grade(stu, c, g', s)]$.
$\forall (stu, c, s)\, [Poss(drop(stu, c), s) \equiv Enrolled(stu, c, s)]$.

Effect Axioms:

$\forall (stu, c, s)\, [Poss(register(stu, c), s) \supset Enrolled(stu, c, do(register(stu, c), s))]$.
$\forall (stu, c, s)\, [Poss(drop(stu, c), s) \supset \neg Enrolled(stu, c, do(drop(stu, c), s))]$.
$\forall (stu, c, g, s)\, [Poss(change(stu, c, g), s) \supset Grade(stu, c, g, do(change(stu, c, g), s))]$.
$\forall (stu, c, g, g', s)\, [Poss(change(stu, c, g'), s) \wedge g \neq g' \supset \neg Grade(stu, c, g, do(change(stu, c, g'), s))]$.

□

A problem with a specification like the one we have so far is that it does not
mention the usually many things (entries in tables) that do not change when a
specific action is executed. We face the so-called frame problem, which
consists of providing a short, succinct specification of the properties that
persist after actions are performed. Reiter [25] discovered a simple solution
to the frame problem as it appears in the situation calculus. It allows one to
construct a first-order specification that accounts both for effects and
non-effects from a specification that contains descriptions of effects only, as
in the example above. We sketch this solution in the rest of this section.

For illustration, assume that we have no negative effects, and two positive
effect laws for table $R$: (1) and

$\forall (\bar{x}, \bar{z}, s)\, [Poss(A'(\bar{z}), s) \wedge \psi^{+}_{R}(\bar{z}, \bar{x}, s) \supset R(\bar{x}, do(A'(\bar{z}), s))]$.   (3)

We may combine them into one general positive effect axiom for table $R$:

$\forall (a, \bar{x}, s)\, [Poss(a, s) \wedge [\exists \bar{y}\, (a = A(\bar{y}) \wedge \varphi^{+}_{R}(\bar{y}, \bar{x}, s)) \vee \exists \bar{z}\, (a = A'(\bar{z}) \wedge \psi^{+}_{R}(\bar{z}, \bar{x}, s))] \supset R(\bar{x}, do(a, s))]$.

In this form we obtain, for each table $R$, a general positive effect law of
the form:

$\forall (a, \bar{x}, s)\, [Poss(a, s) \wedge \gamma^{+}_{R}(a, \bar{x}, s) \supset R(\bar{x}, do(a, s))]$.

Analogously, we obtain, for each table $R$, a general negative effect axiom:

$\forall (a, \bar{x}, s)\, [Poss(a, s) \wedge \gamma^{-}_{R}(a, \bar{x}, s) \supset \neg R(\bar{x}, do(a, s))]$.

For each table $R$ we have represented, in one single axiom, all the actions
and the corresponding conditions on the database that can make $R(\bar{x})$
true at an arbitrary successor state obtained by executing a legal action. In
the same way we can describe when $R(\bar{x})$ becomes false.

Example 2. (cont’d) In the educational example we obtain the following general
effect axioms for the table $Grade$:

$\forall (a, stu, c, g, s)\, [Poss(a, s) \wedge a = change(stu, c, g) \supset Grade(stu, c, g, do(a, s))]$.
$\forall (a, stu, c, g, s)\, [Poss(a, s) \wedge \exists g'\, (a = change(stu, c, g') \wedge g \neq g') \supset \neg Grade(stu, c, g, do(a, s))]$.

□

The basic assumption underlying Reiter’s solution to the frame problem is that
the general effect axioms, both positive and negative, for a given table $R$,
contain all the possibilities for table $R$ to change its truth value from a
state to a successor state. Actually, for each table $R$ we generate its
Successor State Axiom:

$\forall (a, s)\; Poss(a, s) \supset \forall \bar{x}\, [R(\bar{x}, do(a, s)) \equiv (\gamma^{+}_{R}(a, \bar{x}, s) \vee (R(\bar{x}, s) \wedge \neg \gamma^{-}_{R}(a, \bar{x}, s)))]$.   (4)

Here, $\gamma^{+}_{R}$ and $\gamma^{-}_{R}$ are of the form
$\bigvee_{\text{some } A\text{'s}} \exists \bar{u}\, (a = A(\bar{u}) \wedge \varphi(\bar{u}, \bar{x}, s))$,
meaning that action $A$, under condition $\varphi$, makes $R(\bar{x}, do(A, s))$
true, in the case of $\gamma^{+}$, and false, in the case of $\gamma^{-}$.
Thus, the SSA says that if action $a$ is possible, then $R$ becomes true at the
successor state that results from the execution of action $a$ if and only if
$a$ is one of the actions causing $R$ to be true (and for which the
corresponding preconditions, $\varphi$, are true), or $R$ was already true
before executing $a$ and this action is not one of the actions that falsify $R$.
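
As a worked instance of (4), consider the table Enrolled of Example 1. Reading
off its effect axioms, and assuming (as those axioms suggest) that register is
the only action with a positive effect on Enrolled, that drop is the only one
with a negative effect, and that neither carries an extra database
precondition, the schema can be filled in as follows:

```latex
\begin{align*}
\gamma^{+}_{Enrolled}(a, stu, c, s) &\equiv a = register(stu, c), \\
\gamma^{-}_{Enrolled}(a, stu, c, s) &\equiv a = drop(stu, c), \\
\forall (a, s)\; Poss(a, s) \supset \forall (stu, c)\,
  [Enrolled(stu, c, do(a, s)) &\equiv a = register(stu, c) \\
  &\quad \vee\; (Enrolled(stu, c, s) \wedge a \neq drop(stu, c))].
\end{align*}
```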

Example 3. (cont’d) In our running example, we obtain the following SSAs for
the tables in the database:

$\forall (a, s)\; Poss(a, s) \supset \forall (stu, c)\, [Enrolled(stu, c, do(a, s)) \equiv a = register(stu, c) \vee Enrolled(stu, c, s) \wedge a \neq drop(stu, c)]$.

$\forall (a, s)\; Poss(a, s) \supset \forall (stu, c, g)\, [Grade(stu, c, g, do(a, s)) \equiv a = change(stu, c, g) \vee Grade(stu, c, g, s) \wedge \neg \exists g'\, (a = change(stu, c, g') \wedge g' \neq g)]$.

□
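
The two SSAs can be read as a state-update function. The following Python
fragment is a minimal sketch under stated assumptions, not part of the
formalism: a state is represented extensionally by the current contents of the
two tables, actions are tuples such as `("register", "john", "db101")`, and the
names `State`, `poss`, and `do` are illustrative.

```python
from typing import FrozenSet, NamedTuple, Tuple

Action = Tuple  # ("register", stu, c) | ("drop", stu, c) | ("change", stu, c, g)

class State(NamedTuple):
    enrolled: FrozenSet[Tuple[str, str]]
    grade: FrozenSet[Tuple[str, str, int]]

def poss(a: Action, s: State) -> bool:
    """Action precondition axioms of Example 1."""
    if a[0] == "register":
        return (a[1], a[2]) not in s.enrolled
    if a[0] == "drop":
        return (a[1], a[2]) in s.enrolled
    if a[0] == "change":
        # Possible iff some grade g' is already recorded for (stu, c).
        return any((stu, c) == (a[1], a[2]) for stu, c, _ in s.grade)
    return False

def do(a: Action, s: State) -> State:
    """Successor state computed from the SSAs for Enrolled and Grade."""
    if not poss(a, s):
        raise ValueError(f"action {a} is not possible in this state")
    # Enrolled persists unless a = drop(stu, c); it is created by register(stu, c).
    enrolled = {t for t in s.enrolled if a != ("drop",) + t}
    if a[0] == "register":
        enrolled.add((a[1], a[2]))
    # Grade persists unless a = change(stu, c, g') with g' != g.
    grade = {(stu, c, g) for (stu, c, g) in s.grade
             if not (a[0] == "change" and a[1:3] == (stu, c) and a[3] != g)}
    if a[0] == "change":
        grade.add((a[1], a[2], a[3]))
    return State(frozenset(enrolled), frozenset(grade))
```

Because `change` is only possible when a grade already exists, the initial
state used to try this out has to contain at least one Grade tuple.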

Notice that, provided there is complete knowledge about the contents of the
tables at the initial state, the SSAs completely describe the contents of the
tables at every state that can be reached by executing a finite sequence of
legal primitive transactions (that is, transactions for which the corresponding
$Poss$ conditions are satisfied). The SSAs have a nice inductive structure that
makes some reasoning tasks easy, at least in principle.

In order for the specification to have the right logical consequences, we will
assume that the following Foundational Axioms of the Situation Calculus (FAs)
underlie any database specification [14]:

1. Unique Names Axioms for Actions (UNAA): $A_i(\bar{x}) \neq A_j(\bar{y})$, for
all different action names $A_i$, $A_j$; and
$\forall (\bar{x}, \bar{y})\, [A(\bar{x}) = A(\bar{y}) \supset \bar{x} = \bar{y}]$,
for every action name $A$.

2. Unique Names Axioms for States:

$S_0 \neq do(a, s)$,
$do(a_1, s_1) = do(a_2, s_2) \supset a_1 = a_2 \wedge s_1 = s_2$.

3. For some reasoning tasks we need an Induction Axiom on States:

$\forall P\, [P(S_0) \wedge \forall s\, \forall a\, (P(s) \supset P(do(a, s))) \supset \forall s\, P(s)]$,

that has the effect of restricting the domain of situations to the one
containing the initial situation and the situations that can be obtained by
executing a finite number of actions. In this way, no non-standard situations
may appear. The axiom is second order, but for some reasoning tasks, like
proving integrity constraints, reasoning can be done at the first-order level
[14,3].

4. Finally, we will usually be interested in reasoning about states that are
accessible from the initial situation by executing a finite sequence of legal
actions. This accessibility relation on states, $<$, can be defined from the
induction axiom plus the conditions:

$\neg\, s < S_0$,
$s < do(a, s') \equiv Poss(a, s') \wedge s \leq s'$,

where $s \leq s'$ abbreviates $s < s' \vee s = s'$.

Summarizing, a specification $\Sigma$, in the SC, of transaction-based database
updates consists of the sets: $\Sigma_0 \cup APAs \cup SSAs \cup FAs$.

Example 4. (cont’d) A static IC we would like to see satisfied at every
accessible state of the database is the functional dependency for table
$Grade$^3. The IC can be expressed by:

$\forall s\, (S_0 \leq s \supset \forall stu, c, g_1, g_2\, (Grade(stu, c, g_1, s) \wedge Grade(stu, c, g_2, s) \supset g_1 = g_2))$.   (5)

According to Reiter [26], this formula should be a logical consequence of a
correct specification of the form described above; in our example this is
actually the case. Otherwise, we would have to embed the IC into a modified
specification of the same form as before, or generate active rules for making
the IC hold. □
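
The claim that (5) holds at every accessible state can be explored on a small
instance. The following Python fragment is a bounded sanity check, not a proof
(the general argument needs the induction axiom). It assumes a hypothetical
finite domain and initial database, represents a situation as the tuple of
actions executed from $S_0$, and evaluates the tables by recursion on the SSAs;
all names are illustrative.

```python
from itertools import product

# Hypothetical finite domain and initial database (not taken from the paper).
STUDENTS, COURSES, GRADES = ["john"], ["db"], [4, 5, 6]
INIT_ENROLLED = set()
INIT_GRADE = {("john", "db", 4)}

ACTIONS = (
    [("register", st, c) for st in STUDENTS for c in COURSES]
    + [("drop", st, c) for st in STUDENTS for c in COURSES]
    + [("change", st, c, g) for st in STUDENTS for c in COURSES for g in GRADES]
)

def enrolled(stu, c, s):
    """Enrolled(stu, c, s), where s is a tuple of actions; () plays the role of S0."""
    if not s:
        return (stu, c) in INIT_ENROLLED
    prev, a = s[:-1], s[-1]
    return a == ("register", stu, c) or (enrolled(stu, c, prev)
                                         and a != ("drop", stu, c))

def grade(stu, c, g, s):
    """Grade(stu, c, g, s), by recursion on the SSA for Grade."""
    if not s:
        return (stu, c, g) in INIT_GRADE
    prev, a = s[:-1], s[-1]
    overwritten = a[0] == "change" and a[1:3] == (stu, c) and a[3] != g
    return a == ("change", stu, c, g) or (grade(stu, c, g, prev) and not overwritten)

def poss(a, s):
    """Action precondition axioms of Example 1."""
    if a[0] == "register":
        return not enrolled(a[1], a[2], s)
    if a[0] == "drop":
        return enrolled(a[1], a[2], s)
    if a[0] == "change":
        return any(grade(a[1], a[2], g, s) for g in GRADES)
    return False

def accessible(s):
    """S0 <= s: every action was possible in the situation where it was executed."""
    return all(poss(s[i], s[:i]) for i in range(len(s)))

def fd_holds(s):
    """The functional dependency (5), checked at situation s."""
    return all(g1 == g2 or not (grade(st, c, g1, s) and grade(st, c, g2, s))
               for st, c, g1, g2 in product(STUDENTS, COURSES, GRADES, GRADES))

def ic_holds_up_to(depth):
    """Check (5) at all accessible situations reached by at most `depth` actions."""
    return all(fd_holds(s)
               for n in range(depth + 1)
               for s in product(ACTIONS, repeat=n)
               if accessible(s))
```

The check succeeds here because the initial database satisfies the dependency
and `change` removes every other grade for the same student and course.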

3 Integrity Constraints and Internal Actions 

In the following, we distinguish between agent or user actions and internal
actions^4. We have already considered the first class; they are user-defined
primitive transactions, and appear explicitly in a possibly more complex user
transaction^5. In contrast, the internal actions are executed by the database
management system as a response to the state of the database. They will be
executed immediately when they are expected to be executed, with the purpose of
restoring the integrity of the database.

In the rest of this paper, with the purpose of illustrating our approach, we
will consider only ICs of the form

$\forall s\, (S_0 \leq s \supset \forall \bar{x}\, (R_1(\bar{x}, s) \wedge R_2(\bar{x}, s) \supset R(\bar{x}, s)))$,   (6)

where $R_1$, $R_2$, $R$ are table names, or negations of them, and the
variables in each of them in (6) are among the variables in $\bar{x}$. The
$R$s could also be built-in predicates, like the equality predicate. In
particular, as described in a later example, functional dependencies fall in
this class.

A basic assumption here is that an IC like (6) is not just a logical formula
that has to be made true in every state of the database, but it also has a
causal intention, in the sense that every time the antecedent becomes true,
necessarily the consequent has to become true as well, whereas from a purely
logical point of view, just making the antecedent false would work (see [13,28]
for a discussion about the logical representation of causal rules).

^3 In this paper we consider only static integrity constraints. A methodology
for treating dynamic integrity constraints as static integrity constraints is
presented in [1].

^4 In [22] they are called natural actions.

^5 Complex database transactions have been treated in [4]. In this paper we
restrict ourselves to primitive transactions only.

Example 5. (cont’d) The functional dependency (5) has the form (6). Nevertheless,
it is not written in a “causal” form. That is, the intention behind the axiom
is not to state that if two grades $g_1$ and $g_2$ are recorded for the same student
in a course at a given situation, then both grades are caused to be the same.
Actually, it would make no sense to try to enforce this. A more appropriate way
to write (5) is

\[ \forall s\,(S_0 \le s \supset \forall stu, c, g_1, g_2\,(Grade(stu,c,g_1,s) \land g_1 \neq g_2 \supset \neg Grade(stu,c,g_2,s))), \tag{7} \]

that is, a student having a certain grade is the cause for the same student not
having any other different grade. □

For each IC of this form, we introduce an internal action name $A_R$ with as
many arguments as there are non-situational arguments in $R$. We introduce a new
predicate, Internal, on actions; then, for $A_R$, we specify

\[ Internal(A_R(\bar{x})), \tag{8} \]

\[ Poss(A_R(\bar{x}), s) \supset R_1(\bar{x}, s) \land R_2(\bar{x}, s) \land \neg R(\bar{x}, s), \tag{9} \]

\[ Poss(A_R(\bar{x}), s) \supset R(\bar{x}, do(A_R(\bar{x}), s)). \tag{10} \]

This says that: (a) the new action is internal; (b) a necessary condition for the
internal action to be possible is that the corresponding IC is violated; and (c) if
it is possible, then, after the internal action is executed, the $R$ mentioned in the
head of (6) becomes true of the tuple $\bar{x}$ at the successor state.

Notice that the right-hand side of (9) should be an evaluable or domain-independent
formula [29,10]. In addition, as mentioned in Example 5, and in accordance
with the causal view of ICs, the literal $R$ there should not be associated with a
built-in predicate, because its satisfaction is enforced through formula (10).

As discussed later on, there may be extra necessary conditions at a state $s$
to specify for the execution of $A_R$ at $s$. Once all these necessary conditions have
been collected, they can be placed in a single axiom of the form

\[ Poss(A_R(\bar{x}), s) \supset \varphi_{A_R}(\bar{x}, s). \tag{11} \]

Later, we appeal to Clark’s completion [9] of the possibility predicate for the
internal action, thereby transforming necessary conditions into necessary and
sufficient conditions, that is, replacing $\supset$ by $\equiv$ in (11).

In our example, $\varphi_{A_R}(\bar{x}, s)$ would contain the right-hand side of (9), among
other things, but not the right-hand side of (10), which corresponds to a new effect
axiom.

The need for extra necessary conditions for the repairing action $A_R$ in (9)
is related to the specific repair policies to be adopted. In our running example, we
might decide that when a new grade is inserted for a student in a course, then
the old grade has to be eliminated in favor of the new one. Therefore, adding
the new grade will be equivalent to performing an update. This should be the
effect of the internal, repairing action. Therefore, the extra necessary condition
for this action is that the grade to be eliminated was in the database before the
new one was introduced. Historical conditions of this sort can be handled in our
formalism by means of virtual tables that record the changes in the tables [4]
(in our example, the old grade would not be recorded as a change, but the new
one would; see Example 6 below); or by means of a methodology, presented in [1]
and based on [6], for specifying the dynamics of auxiliary, virtual, and history
encoding views defined from formulas written in past temporal logic.

Once the internal action is introduced, in order to produce the effects described
in (10), the action $A_R$ has to be inserted in the SSA of the corresponding
table: if the right-hand side of (6) is a table $R$, with an SSA like (4), then
$a = A_R(\bar{x})$ must now appear in $\gamma_R^+(a, \bar{x}, s)$ in (4) to make $R$ true. If the
right-hand side is $\neg R$, then $a = A_R(\bar{x})$ must appear in $\gamma_R^-(a, \bar{x}, s)$, to make
$R$ false.

Action $A_R$ is specified to make $R$ true. Because $R$ is different from $R_1$
and $R_2$ or, when $R$ is the same as, say, $R_1$, because the arguments to
which the literals apply are different, $A_R$ will not affect the truth
of $R_1(\bar{x}, s) \land R_2(\bar{x}, s)$; it will persist from $s$ to $do(A_R(\bar{x}), s)$, making the IC
true at $do(A_R(\bar{x}), s)$. That is, $A_R$ has a repairing effect. Notice also that the IC
is not satisfied at the state $s$, where the internal action $A_R$ will be forced to occur.
In this sense, $s$ will be an “unstable” situation.

Since, later on, we will force internal actions to occur at corresponding un- 
stable situations, we define a stable situation as every situation where no internal 
actions are possible: 

\[ stable(s) \equiv \neg \exists a\,(Internal(a) \land Poss(a, s)). \tag{12} \]

Intuitively, the stable situations are those where the integrity constraints are to 
be satisfied. Instead, unstable situations are a part of a sequence of situations 
leading to a stable situation; in those unstable, intermediate situations, ICs do 
not need to hold. The transition to a stable situation is obtained by the execution 
of auxiliary, repairing actions. 
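
As an informal illustration of definition (12), the following Python sketch treats a database state as a set of facts and an internal action as a pair of a precondition and an effect. The names `InternalAction`, `stable`, `make_repair`, and the toy constraint "P(x) implies Q(x)" (a special case of form (6)) are assumptions made only for this illustration; they are not part of the specification.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

State = frozenset  # a database state: a set of (table, value) facts


@dataclass(frozen=True)
class InternalAction:
    """Hypothetical stand-in for an internal action A_R(x)."""
    name: str
    poss: Callable[[State], bool]     # plays the role of Poss(a, s)
    effect: Callable[[State], State]  # the state reached by do(a, s)


def stable(state: State, internal: Iterable[InternalAction]) -> bool:
    """Definition (12): s is stable iff no internal action is possible in s."""
    return not any(a.poss(state) for a in internal)


def make_repair(x: str) -> InternalAction:
    """Repair for the toy IC: possible when P(x) holds and Q(x) does not."""
    return InternalAction(
        name=f"A_Q({x})",
        poss=lambda s: ("P", x) in s and ("Q", x) not in s,
        effect=lambda s: s | {("Q", x)},
    )


internal_actions = [make_repair("a"), make_repair("b")]
unstable_state = frozenset({("P", "a")})
repaired_state = internal_actions[0].effect(unstable_state)
```

In this sketch `unstable_state` is not stable, because the repair for `a` is possible, while `repaired_state` is stable and satisfies the toy constraint.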

The specification in (11) suggests the introduction of the following active
rule:

\[ E_{A_R}(\bar{x});\ \{\varphi_{A_R}(\bar{x})\} \rightarrow A_R(\bar{x}) \tag{13} \]

where $E_{A_R}(\bar{x})$ is an Event associated with the IC that corresponds to changes
produced in the database, for example, the insertion of $\bar{x}$ in the tables appearing
in (9). This event causes the rule to be considered. Then, the Condition
$\{\varphi_{A_R}(\bar{x})\}$ is evaluated. It includes the fact that $R_1$, $R_2$ became true and $R$
became false. If this Condition is satisfied, then the Action $A_R$ is executed.

An alternative to rule (13) would be to skip $E_{A_R}$ and always check the Condition
$\{\varphi_{A_R}(\bar{x})\}$ after any transaction, but including in the Condition the information
about the changes produced in the database by keeping them in an auxiliary
view, so that they can be taken into account for the Action to be executed. This
approach is illustrated in the next example.

Example 6. (cont’d) Given the functional dependency (7), we introduce the
internal action $A_{Grade}(stu, c, g)$. Assume that we already have a view

\[ Changes_{Grade}(stu, c, g, s) \]

that records the changes in Grade. That is, $Changes_{Grade}(stu, c, g, s)$ means that
$Grade(stu, c, g, s) \land \neg Grade(stu, c, g, s')$, where $s'$ is the situation that precedes
$s$. Then, the precondition axiom for the new action is

\[ Poss(A_{Grade}(stu, c, g), s) \equiv Grade(stu, c, g_1, s) \land g_1 \neq g \land Grade(stu, c, g, s) \land Changes_{Grade}(stu, c, g_1, s). \tag{14} \]

The new action should have the effect of deleting the tuple $(stu, c, g)$ from Grade.
This is specified with the effect axiom

\[ Poss(A_{Grade}(stu, c, g), s) \supset \neg Grade(stu, c, g, do(A_{Grade}(stu, c, g), s)), \tag{15} \]

which corresponds to (10).

Thus, when the IC is violated (this violation is expressed by the first three
conjuncts on the right side of (14)), the action $A_{Grade}$ is possible. When the
internal action becomes possible, it must be executed. As a result of the execution,
the repair is carried out by eliminating the old grade.

Notice that the predicate Changes could be pushed to the Event part of the
rule, as discussed before, because it keeps a record of the changes in the database.
The dynamics of an auxiliary view like Changes could itself be specified by means
of a corresponding SSA, thereby integrating everything we need into the same
specification. This can be achieved by means of the general methodologies
developed in [2] and [1] for deriving SSAs for views and history encoding relations,
respectively.
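
The running example can be simulated in a few lines of Python. The following is only a sketch under stated assumptions: the state representation, the user action `insert_grade`, and the sample data are hypothetical; the variable $g_1$ of axiom (14) is read existentially; and the Changes view is left untouched by the internal action.

```python
from typing import FrozenSet, NamedTuple, Tuple

Fact = Tuple[str, str, int]  # (student, course, grade)


class State(NamedTuple):
    grade: FrozenSet[Fact]    # contents of table Grade
    changes: FrozenSet[Fact]  # view Changes_Grade: tuples newly inserted


def insert_grade(state: State, stu: str, c: str, g: int) -> State:
    """Hypothetical user action: insert a grade and record it as a change."""
    fact = (stu, c, g)
    return State(state.grade | {fact}, frozenset({fact}))


def poss_repair(state: State, stu: str, c: str, g: int) -> bool:
    """Axiom (14): (stu, c, g) is stored and a different, newly inserted g1 exists."""
    return (stu, c, g) in state.grade and any(
        s == stu and k == c and g1 != g and (s, k, g1) in state.changes
        for (s, k, g1) in state.grade
    )


def do_repair(state: State, stu: str, c: str, g: int) -> State:
    """Axiom (15): after A_Grade(stu, c, g), the tuple is no longer in Grade."""
    return State(state.grade - {(stu, c, g)}, state.changes)


s0 = State(frozenset({("ann", "db", 70)}), frozenset())
s1 = insert_grade(s0, "ann", "db", 85)  # FD violated: two grades for ann in db
s2 = do_repair(s1, "ann", "db", 70)     # the repair removes the old grade
```

In `s1` the repair is possible only for the old grade 70, because only the new grade 85 is recorded in the Changes view; after the repair, `s2` contains the new grade alone and no further repair is possible.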



4 Specifying Executions 

The approach presented in the previous section is still incomplete. Indeed, for this 
approach to work, we need to address two independent but important problems. 
First, we need to specify that executions should be enforced, since, up to now, 
the formalism presented deals with hypothetical executions. Second, we need to 
deal with the problem of executions of repairing actions arising from several ICs 
violated in the same situation. We deal with these issues below. 

Notice that there is nothing in our SC specification that forces the action
$A_R$ to be executed. The whole specification is hypothetical in the sense that if
the actions ... were executed, then the effects ... would be observed. Thus, from
the logical specification of the dynamics of change of a traditional database,
it is possible to reason about all its possible legal evolutions. However, these
specifications do not consider transactions that must be executed given certain
environmental conditions (i.e., the database state). In order to include active
rules in this style of specification, it is necessary to extend the situation calculus
with executions. This is necessary, given that the actions specified by active
rules must be forced to be executed when the associated Event happens and the
corresponding Condition is satisfied. That is, the future is not open to all possible
evolutions, but constrained by the necessary execution of actions mentioned in
the rules that fire, given that their related conditions hold.

The notion of execution in SC was first introduced in [24]. This problem
has subsequently been treated in [19,22]. Our discussion is based upon [22].
The starting point is the observation that every situation $s$ identifies a unique
sequence of actions. That is, situations can be identified with the history of
actions that lead to them (starting in $S_0$): $s = do(a_n, \ldots, do(a_2, do(a_1, S_0)) \ldots)$.
We say that the actions $a_1, a_2, \ldots, a_n$ belong to the history of $s$. The predicate
executed, which takes an action and a situation as arguments, is introduced in
order to specify constraints on valid or legal histories. For illustration purposes,
let us assume that we have situations S, S', such that S < S'. Further, assume
that we specify that executed(A, S). The fact that A has to have been executed
in S, from the perspective of S', should entail that action A must appear in the
history of S' (unless S' were not legal), immediately after S. To specify such a
constraint we use the predicate legal for situations. This predicate characterizes
the situations that conform to the executions that should arise in their histories;
the specification of legal is as follows:

\[ legal(s_h) \equiv S_0 \le s_h \land \forall a, s\,(s < s_h \land executed(a, s) \supset do(a, s) \le s_h). \tag{16} \]

The notion of legality, defined with the legal predicate, introduces a more
restrictive form of legality for situations than the notion strictly based upon the
Poss predicate. A situation is considered legal, in this more restrictive sense, if
the executions that must arise in its history appear in it, and if all the situations
in the history are reached by performing possible transactions starting in
$S_0$. When modeling active databases, we consider that situations are legal when
their histories are consistent with the intended semantics of the rule executions.
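
The execution constraint in definition (16) can be checked mechanically once situations are identified with their histories. The sketch below is an illustration under assumptions: situations are tuples of action names starting from the empty history for $S_0$, the ordering on situations is the prefix relation, the Poss-based part of legality is omitted, and the rule "a repair must follow every insert" is a hypothetical example.

```python
from typing import Callable, Iterable, Tuple

Situation = Tuple[str, ...]  # history of action names; S0 is the empty tuple


def do(action: str, s: Situation) -> Situation:
    """The successor situation do(a, s)."""
    return s + (action,)


def precedes_or_equal(s: Situation, t: Situation) -> bool:
    """s <= t: the history of s is a prefix of the history of t."""
    return t[: len(s)] == s


def legal(sh: Situation,
          executed: Callable[[str, Situation], bool],
          actions: Iterable[str]) -> bool:
    """Execution constraint of (16): for every s < sh and every action a with
    executed(a, s), do(a, s) must be a prefix of sh."""
    actions = list(actions)
    for i in range(len(sh)):  # the proper prefixes s < sh
        s = sh[:i]
        for a in actions:
            if executed(a, s) and not precedes_or_equal(do(a, s), sh):
                return False
    return True


def executed(a: str, s: Situation) -> bool:
    """Hypothetical constraint: right after an insert, a repair must occur."""
    return a == "repair" and len(s) > 0 and s[-1] == "insert"


ACTIONS = ["insert", "repair", "query"]
```

Under this constraint the history `("insert", "repair", "query")` is legal, whereas `("insert", "query")` is not, because the required repair does not immediately follow the insert.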

Whenever the condition of an active rule under consideration is satisfied, the
action mentioned in the rule must be executed. Therefore, the specification of
an active rule must include the predicate executed associated with its Action,
which in our context is an internal action. A database whose situations are all
legal will be such that active rules, when triggered, are properly dealt with.
Thus, in a situation calculus tree we only consider branches in which
actions that must be executed by rule triggerings are actually executed.

Now we can force internal actions to be executed. This is specified as follows: 

\[ \forall a, s\,(Poss(a, s) \land Internal(a) \supset executed(a, s)). \tag{17} \]

That is, if an internal action is possible, it must be executed. In this way the
repairing internal actions are executed immediately. Nevertheless, it should not
be difficult to specify delayed executions in our SC formalism. In [4], these issues
are considered along with other issues dealing with execution priority of rules
and execution of complex transactions.

Footnote: In [24], the notion of execution was called occurrence.

The active rule (13) is not written in the SC object language of the
specification, and in that sense its semantics is not integrated with the rest of the
first-order semantics. Nevertheless, it can now be eliminated from the
specification in favor of axioms (8), (9), (10), and (17). Recall that, in addition to the
introduction of these new axioms, the SSA (4) for table $R$ has to be modified
by introducing $a = A_R$ in the formula $\gamma_R^+(a, \bar{x}, s)$, since new positive effects have
been specified for table $R$. This re-computation of the SSAs is a very
simple task: only one action per IC, with its condition, has to be plugged into an
SSA. If the active rules for database maintenance are given in advance, then the
SSAs for the tables can be computed incorporating the corresponding actions
from the very beginning.

Now, it can be proved that the following formula is a logical consequence of 
the new specification: 

\[ \forall s\,(S_0 \le s \land stable(s) \supset \forall \bar{x}\,(R_1(\bar{x}, s) \land R_2(\bar{x}, s) \supset R(\bar{x}, s))). \tag{18} \]

It says that the IC is satisfied at all stable situations of the database. 

There is, however, a problem with the above axiomatization in the context
of our specification. The specification of executions in [22] is given in a situation
calculus with concurrent (simultaneous) actions. In our specification, primitive
actions are executed non-concurrently. This may be a problem if two separate
IC repairing actions are possible in the same situation. In fact, assume that
two separate ICs are violated in a given situation $S$. Assume further that there
are two internal repairing actions $A_1$ and $A_2$, one defined for each of these
two ICs. Since both ICs are violated in $S$, we need both $A_1$ and $A_2$ to be
executable in $S$.

The situation calculus that we have been using is non-concurrent, in the sense
that, given a situation $s$, any successor situation is obtained by the execution of
a single primitive action. One way out of this problem is to ensure that the
views that record changes to the database (Changes in the running example)
are updated only after non-internal actions are executed. In the example, we
can non-deterministically execute $A_1$, and re-evaluate the applicability of $A_2$ once
the first repairing action has been executed. In this case, it would be possible to
have $A_1$ repair both ICs without having to execute $A_2$. It would also be possible
to have situations where one repair introduces other violations of ICs, forcing
yet other repairs. Chaining of repairs and further details related to this problem
are still to be worked out. In order for this approach to work, we need to drop
axiom (17) in favor of:

\[ \forall a, s\,(Poss(a, s) \land Internal(a) \supset pexecuted(a, s)), \tag{19} \]

\[ (\forall s)[(\exists a)\,pexecuted(a, s) \supset (\exists b)(executed(b, s) \land pexecuted(b, s))]. \tag{20} \]

Here, we introduce the predicate pexecuted to represent the notion of possibly
executed. Thus, if some action is possibly executed in a situation $s$, then
some possibly executed action must be executed in $s$. Notice that this has a
non-deterministic flavor. Thus, if several internal actions are possible, then the
specification is satisfied if any one of them is executed.
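
The non-deterministic reading of axioms (19) and (20) can be sketched by enumerating the admissible repair sequences. The following Python fragment is an illustration only: a state is abstracted to the set of names of the violated ICs, the actions `A1` and `A2` and their effects are hypothetical, and it is assumed that repairs never introduce new violations, so that the enumeration terminates.

```python
from typing import Callable, Dict, FrozenSet, List

State = FrozenSet[str]  # names of the ICs currently violated (toy abstraction)

# Hypothetical internal actions, as functions on the set of violated ICs.
INTERNAL: Dict[str, Callable[[State], State]] = {
    "A1": lambda s: s - {"ic1", "ic2"},  # A1 happens to repair both ICs
    "A2": lambda s: s - {"ic2"},         # A2 repairs only ic2
}
# The IC whose violation makes each internal action possible.
ENABLED_BY: Dict[str, str] = {"A1": "ic1", "A2": "ic2"}


def pexecuted(state: State) -> List[str]:
    """Axiom (19): every possible internal action is possibly executed."""
    return sorted(a for a, ic in ENABLED_BY.items() if ic in state)


def repair_sequences(state: State) -> List[List[str]]:
    """Axiom (20): in an unstable state, one possibly executed action is
    executed. Returns all sequences of internal actions reaching stability."""
    candidates = pexecuted(state)
    if not candidates:  # stable state: nothing has to be executed
        return [[]]
    result: List[List[str]] = []
    for a in candidates:  # the non-deterministic choice
        for rest in repair_sequences(INTERNAL[a](state)):
            result.append([a] + rest)
    return result


sequences = repair_sequences(frozenset({"ic1", "ic2"}))
```

With both constraints violated, the sketch yields two admissible sequences: executing `A1` alone, which repairs both ICs, or executing `A2` first and then `A1`.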

5 A Causal Approach to Integrity Constraints 

In this paper we have not considered the problem of determining all possible 
repairs of a database in detail. From a logical point of view, there are many pos- 
sible minimal repairs for an inconsistent database [11,8]. In principle, we could 
choose any of them and specify corresponding maintenance rules for enforcing 
that particular kind of repair. This could be accommodated in our formalism. 
Nevertheless, we might prefer some repairs over others.
For example, we may want to keep the changes produced by a sequence of
primitive transactions even in the case they take the database to a state that
does not satisfy the ICs. In this case, we would generate new, additional changes
which restore the consistency of the database, pruning out some of the logically
possible repairs (like the ones that undo some of the new primitive transactions).
Repairs of this kind are possible only if the updated database is consistent with
the ICs [27].

In our approach, a notion of causality is implicit behind the ICs (see
Example 5). There are cases where the ICs have an implicit causal content,
and making it explicit may help us restrict ourselves, as specifiers, to some
preferred forms of database repairs, as in our running example. Introducing
explicit causality relations into the ICs can be seen as a form of user intervention
[5] that turns out to be a way of predetermining preferred forms of database
repairs.

It is possible to make explicit the causal relation behind a given integrity 
constraint by means of a new causality predicate, as introduced by Lin in [13]. 
This avoids considering a causal relation as a classical implicative relation. 

Lin’s approach is also based upon the situation calculus, albeit in a different
dialect. The main difference is that the tables are not predicates but functions.
For instance, in order to express that a property $p$ is true of an object $x$ in a
situation $s$, we write $p(x, s)$. In the dialect used by Lin, this same statement is
written as $Holds(p(x), s)$. The advantage of treating tables at the object level
is that one can use properties as arguments in a first-order setting. In particular,
Lin’s approach to causality is based upon the introduction of a special predicate
Caused, which takes a table, a truth value, and a situation as arguments. Thus,
one can write $Caused(p(x), True, s)$ with the intent of stating that table $p$ has
been caused to be True of $x$ in situation $s$. In our framework, we will use the
alternative syntax $Caused(p(x, s), True)$, as a meta-formula, and will eliminate
the Caused predicate, as discussed below.

In Lin’s approach, the Caused predicate is treated in a special manner. First,
there is an assumption, formalized using circumscription [16], that Caused has
minimal extent. That is, Caused is assumed to be false unless it must be true.
Furthermore, if a table is caused to be true (false) in a situation, then it must
be true (false) in that situation. If there is no cause for the table to take a truth
value, then the table does not change.

Footnote: In Lin’s framework, a special sort for truth values is introduced. The sort
is fixed, with two elements denoted by the constants True and False, respectively.

It turns out that it is possible, in many interesting cases, to translate Lin’s 
handling of the Caused predicate into a specification in the style proposed in 
this article (making use of Internal actions) [21]. We illustrate this approach by 
interpreting Caused as syntactic sugar and by providing a translation of a causal 
formula to our language. 

The IC (7) of our example can be expressed in causal terms as follows:

\[ Changes_{Grade}(stu, c, g, s) \supset (\forall g')[g \neq g' \supset Caused(Grade(stu, c, g', s), False)]. \tag{21} \]

Keeping in mind that $Changes_{Grade}$ records the addition of a grade for a student
in a course, the formula above should be interpreted as follows: if the grade $g$ has
been provided for student $stu$ in course $c$, then there is a cause for the student
not to have any other grade.

To eliminate Lin’s causality predicate, taking the specification back to the
formalism based on table names, actions and situations only, we pursue the
following idea. We admit the existence of unstable situations in which the causal
rules (ICs) can be violated. In these unstable situations some internal actions
become possible which repair the ICs. The approach introduces a new action
function per causal rule. Let $A_1$ denote the new action function for rule (21).
The rule is replaced by the following axioms, in which, following the pattern of
(9) and (10), the precondition refers to the situation $s$ and the effect to its
successor:

\[ Internal(A_1(stu, c, g)). \tag{22} \]

\[ Poss(A_1(stu, c, g), s) \equiv Changes_{Grade}(stu, c, g, s) \land \neg(\forall g')[g \neq g' \supset \neg Grade(stu, c, g', s)]. \tag{23} \]

\[ Poss(A_1(stu, c, g), s) \supset (\forall g')[g \neq g' \supset \neg Grade(stu, c, g', do(A_1(stu, c, g), s))]. \tag{24} \]

Notice that the elimination of the causal relation follows a mechanical proce- 
dure which can be applied to a set of stratified causal rules, as defined by Lin. 
The stratification simply ensures that there are no circular causalities. In this 
setting, it can be proved that both approaches, using explicit causal rules, and 
the translation, lead to the same results [21]. 

It is illustrative to compare axioms (22)-(24) with the axioms obtained for
the integrity constraint (7) in the approach described in Section 3, which results
in an axiomatization (see Example 5) that yields equivalent results. Therefore,
from a methodological point of view, one can use a logical language extended
with a causality relation that would enhance the expressive capabilities of the
language. This extra expressive power gives the modeler a more natural way
to express preferences regarding database repairs. Furthermore, the semantics
for the new causal relations can be understood in terms of a more conventional
logical language.

6 Determining Events for the Maintenance Rules

So far, we have not said much about the events in the active rules (see last 
part of section 3). In [5], Ceri and Widom handle this problem in detail, al- 
though without saying much about deriving Actions for the maintenance rules. 
They provide a mechanism which, from the syntactic form of an IC, derives the 
transition predicate. This transition predicate determines the Event part of the 
active rule that will maintain the constraint. This predicate is defined in terms 
of the primitive transactions that might lead to a violation of the IC. Ceri and 
Widom present a methodology for determining those transactions. The primitive 
transactions considered are insertions, deletions, and updates in tables. 

In our case, we have user-defined primitive transactions that may affect several
tables simultaneously. In addition, through the SSAs, we know
how each base table evolves as legal actions are executed, and which actions may
affect them. Now, it is possible to associate a view with an IC, namely, the view
that stores the tuples that violate the IC (hopefully this view remains empty).
This is what we have in the right-hand side of condition (9). Since this view is a
derived predicate and not one of the base tables in the database, we may not have an
SSA for it. Nevertheless, as shown in [2], it is always possible to automatically
derive an SSA for a view. Then, we may easily compute an SSA for the violation
predicate (or view) associated with an IC.

Having an SSA of the form (4) for the violation predicate, it can be easily
detected which primitive transactions can make it change and, in particular,
which primitive transactions can make it change from empty to non-empty.
This change entails a violation of the corresponding IC (this can be detected
from the $\gamma^+$ part of the SSA). In this way, we are in a position to obtain a
mechanism for determining events leading to violations of ICs, as in [5]. However, our
approach can be used for more general primitive transactions. In addition, it also
allows repairing actions to be identified from the derived SSAs. This possibility is not
addressed in [5], where that part is left to the application designer; this approach
can be complemented or replaced by an approach, such as ours, which allows
the automatic identification of repairing policies. Even in this scenario, the
application designer could specify his/her repairing preferences by using causality
predicates, as described before.
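
The idea of reading candidate events off the SSAs can be sketched as follows. This Python fragment is a simplified illustration, not the derivation of [2] or [5]: it assumes that each SSA of the form (4) has been summarized by the sets of action names occurring in its $\gamma^+$ and $\gamma^-$ parts, that the violation view is a conjunction of literals over base tables, and that the tables `Enrolled` and `Paid` and their actions are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical summary of SSAs of the form (4): for each base table, the
# action names in gamma+ (make it true) and in gamma- (make it false).
SSA = Dict[str, Tuple[Set[str], Set[str]]]

# A violation view as a conjunction of literals: (table, is_positive).
Literal = Tuple[str, bool]


def violating_events(view: List[Literal], ssa: SSA) -> Set[str]:
    """Actions that can make some literal of the violation view true.

    A positive literal can become true through an action in gamma+ of its
    table, a negative one through an action in gamma-. The result
    over-approximates the actions that can take the view from empty to
    non-empty.
    """
    events: Set[str] = set()
    for table, positive in view:
        gamma_plus, gamma_minus = ssa[table]
        events |= gamma_plus if positive else gamma_minus
    return events


ssa: SSA = {
    "Enrolled": ({"enroll"}, {"drop"}),
    "Paid": ({"pay"}, {"refund"}),
}
# Violation view for the toy IC "Enrolled(x) implies Paid(x)":
# Enrolled(x) and not Paid(x).
view = [("Enrolled", True), ("Paid", False)]
events = violating_events(view, ssa)
```

For the toy constraint, the candidate events are `enroll` and `refund`; the actions `drop` and `pay` can only shrink the violation view, which makes them natural candidates for repairing actions.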

7 Conclusions 

In this paper we have considered the problem of specifying policies for database
maintenance in a framework given by the specification of the dynamics of a
relational database. The specification is given in the situation calculus, a language
that includes both primitive actions and database states at the same level as the
objects in the database domain. Among others, we find the following advantages
in using the SC as a specification language: (1) It has a clear and well-understood
semantics. (2) Everything already done in the literature with respect to
applications of predicate logic to DBs can be done here. In particular, all static
and extensional aspects of databases and query languages are included. (3)
Dynamic aspects can be considered at the same object level; in particular, it is
possible to specify how the database evolves as transactions are executed. (4)
It is possible to reason in an automated manner from the specification and to
extract algorithms for different computational tasks from it. (5) In particular,
it is possible to reason explicitly about DB transactions and their effects. (6) In
this form it is possible to extend the functionalities of usual commercial DBMSs.

Footnote (to Section 6): Other applications of derived SSAs for views storing
violating tuples of ICs are presented in [2].

Repairing actions are introduced for the integrity constraints that are
expected to be satisfied. They are integrated into the original specification by
providing their effects and preconditions. Then simple active rules are created
for repairing the ICs. Since these active rules do not have a predicate logic
semantics, they are alternatively specified in the same formalism as the database
dynamics.

The ICs are expected to be logical consequences of the modified specification, 
which must be true at every legal state of the database. Nevertheless, IC viola- 
tions may give rise to a transition of the database along a sequence of executions 
of repairing actions, during which the ICs are not necessarily satisfied. Since 
we may not exclude those intermediate states from the database dynamics, we 
distinguish in our formalism between stable and unstable states. It is only at 
stable states where the ICs have to be satisfied, and this can be proved from the 
new specification. Instead, the unstable states are related to executions of the 
repairing actions. 

The original specification has a hypothetical nature, in the sense of describing 
what the database would be like if the actions were executed. Therefore, no 
executions can be said to necessarily occur. To overcome this limitation, we 
extended the formalism with the notion of executed action. In this way, we 
can deal with the imperative nature of active rules, thus forcing executions of 
repairing actions. 

We have considered the derivation of repairing actions for a simple case of
actions (primitive actions) and ICs. Repair policies for more complex cases
of ICs based on sequences of atomic transactions [12] could be integrated in our
specification formalism. For doing this, a treatment of more complex active rules
would be necessary. In [4], we developed, in an extended formalism for specifying
database dynamics, the whole framework needed for specifying this kind of
active rule, including complex user transactions and the Actions in the rules,
priorities among rules, database transitions and rollbacks.

By using the derived specification of the dynamics of views storing the tuples
that violate an IC, it is possible to determine the right events for the
maintenance rules. To restore consistency, new internal primitive actions are
introduced. Which repairing actions will be introduced, and with which effects,
may depend on the causal content that the user attributes to the ICs.

The final result of the whole process of new axiom derivation will be a new
specification, extending the original one. The resulting specification has a clear
standard Tarskian semantics, and includes an implicit imperative declaration of
active rules for IC maintenance. From the resulting specifications, direct
first-order automated reasoning is possible, e.g., about the behavior of active rules.
The causal content of an IC that the application designer might have in mind
can be easily specified in the resulting specification, without leaving its classical
semantics. In this way, preferences for particular maintenance policies can be
captured.

Abstract. This paper presents a solution for checking integrity constraints
in database systems supporting nested transactions. Using nested
transactions makes it possible to introduce parallelism inside a transaction and to
partially recover failing transactions by defining a hierarchy of sub-transactions.
If a constraint is violated by some sub-transactions, it is possible
to reach the validation of the nested transaction, even if some part of it
had to be aborted. In our solution, (i) only constraints that might be
violated are checked, (ii) constraints are checked as soon as possible during
the execution of the nested transaction and (iii) as few sub-transactions
as possible are aborted. We do not interfere with the execution control
of nested transactions and users do not have to add any control code
in the definition of constraints or of transactions. The main idea of our
solution is to attach the checking of a constraint to the smallest common
ancestor of the sub-transactions which could violate the constraint.

Keywords: integrity constraints, nested transactions, partial abort. 



1 Introduction 

It is clear now that the classic model of flat transactions [19] is not suitable for
expressing the complexity of today's applications. Much work has been devoted
to extending the flat transaction model in order to allow the modeling of complex
long-duration transactions and distributed applications in cooperative environments.
In order to fulfill this need for more flexibility, other transaction models have
arisen, such as nested transactions [34], multi-level transactions [43], Sagas [18],
ConTracts [38], and the transactional activity model [11], among others. A detailed
review of advanced transaction models can be found in [20]. However, none of
the extended models defined up to now has turned out to be general enough to
accommodate all types of applications. ACTA [9,10] is a first-order logic-based
formalism intended to unify the existing models. It can be used as a common
framework within which one can specify and reason about the nature of interactions
between transactions of extended models. [40] extends the ACTA formalism and
introduces the notion of transaction closure. It considers general transaction
structures and distinguishes different dependencies among transactions.

CNRS-France.

Using nested transactions makes it possible to decompose a transaction into
sub-transactions. This offers the possibility to introduce parallelism inside a
transaction, and to partially recover failing transactions. Since they were introduced
by Moss [34], nested transactions have been studied extensively. Different variations
on this model can be found in [14]. Correct and reliable algorithms to manage
concurrency control, including deadlock detection and failure recovery, have been
proposed [35,39,24,32]. Some systems providing nested transactions have been
developed, such as Argus [30], Camelot [16], and SIMA [29], and the work described
in [6] proposes a three-level architecture to support nested transactions on
top of standard commercial DBMSs. Recently, with the commercialisation of
Encina [4,1], a system that provides nested transaction facilities, many critical
applications in industry and banking [2] have been developed based on
this transaction model.

On the other hand, the growing complexity of present database applications
necessitates the development of efficient consistency management systems.
Consistency means that the database must be semantically correct. It is classically
ensured by the definition of integrity constraints, which are assertions defined on
the database that must be satisfied at the end of each transaction. Much work
has been devoted to this problem [36,8,23,7,22] and many DBMSs now provide
this functionality [26,42,15]. Good surveys describing the various approaches are
given in [21,17].

In the great majority of cases, consistency management systems are designed 
for simple, classical flat transaction models, and very few solutions have been 
proposed for the management of integrity constraints in the context of nested 
transactions. Although it seems natural that the nested transaction model 
allows checking the constraints in each sub-transaction, this is not necessarily 
the best solution. Indeed, a constraint could then be checked many 
times if it is touched by more than one sub-transaction of the nested 
transaction. Surprisingly, very few approaches have treated this problem. [28] presents 
a transaction model, named NT/PV (nested transactions with predicates and 
versions), which allows correctness without serializability for long-duration 
transactions. In the NT/PV model, each transaction has a pre-condition and a 
post-condition. The pre-condition of every transaction describes the database state 
which is required for the transaction to execute correctly. The post-condition 
describes the database state which would exist after a transaction is executed in 
isolation on a database state which satisfies its pre-condition. In this approach, 
integrity constraints are integrated only in the post-condition of the top-level 
transaction. This allows checking the integrity constraints only in the top-level 
transaction. [12] proposes a mechanism to express and check integrity constraints 
for object-oriented databases allowing nested transactions. In this approach, 
integrity constraints are defined at method level, using pre- and post-conditions, 
associated with an exception handling mechanism. An exception is raised 
whenever a pre- (post-) condition is violated. The main drawback of this proposal is 
that the programmer is in charge of describing, in a procedural way, the actions 
to perform in case of violation of a constraint. Moreover, the programmer must also 
decide at which level inside the nested transactions these actions are described and 
performed. 

In this paper, we propose a mechanism to manage integrity constraints for 
database systems using nested transactions. This mechanism integrates a 
conventional nested transaction execution control with the ability to check global 
integrity constraints as soon as possible during the execution of a transaction. 
We do not interfere with the execution control of nested transactions, which 
renders our approach very flexible. Transparency is provided since users do not have 
to add any control code in the definitions of constraints or of transactions. 
Integrity constraints are defined in a global way, at the database schema level, and 
are checked by the system. A syntactic analysis makes it possible to know, at transaction 
compile time, the structures manipulated by both the leaf transactions and the 
constraints, and thus to determine the set of constraints that might be violated 
by the nested transaction. The checking code of each such constraint is automatically 
inserted into the transaction. 

This paper is organised as follows. Section 2 presents the nested transaction 
model, while Section 3 describes the main principles of Themis, an object 
database language in which integrity constraints can be defined and automatically 
checked. Section 4 presents our solution to integrity checking in the context of 
nested transactions. We first detail our solution in a simple case, where all 
sub-transactions must commit in order to commit the whole nested transaction. Then we 
describe our solution in a more general case, where sub-transactions might be 
optional. Section 5 concludes and proposes some directions for future work. 



2 Nested Transactions 

Nested transactions, as presented by Moss [34], are an appropriate solution in 
systems where transactions are “huge” tasks [4,1,6,29,2]. In these cases, a 
complex task can be divided into logical functions, which can be performed with a 
certain degree of independence and which can in turn be divided into other logical 
tasks, and so on. In the context of nested transactions, the complex task corresponds 
to the root transaction and each logical function corresponds to one of its 
sub-transactions. We recall below the concept of nested transactions and 
their properties. 

A nested transaction (NT) is a transaction formed by 

— operations on database objects, and/or 

— other transactions (called sub-transactions) that can also be nested. 

An NT can be represented as a tree where the nodes are transactions and the 
arcs reflect the nesting relationship that relates a (parent) transaction and its 
child transactions. Each sub-transaction of an NT is executed independently, 
and possibly in parallel; this means that it can decide either to commit or to 
abort at any time. This decision may depend on the results of the execution of 
its sub-transactions. If a sub-transaction aborts, this does not mean that its parent 
must abort; it does mean, however, that all of its own sub-transactions must abort. If 
a sub-transaction decides to commit, this is not definitive, since the update of 
the system will only happen when all the transactions that contain it decide to 
commit, i.e. when the root transaction completes. 

This consequently leads to a redefinition of the idea of atomicity of a 
transaction (“all or nothing”) for NT. In this context, when a root transaction commits, 
it must guarantee that the updates of all sub-transactions which decided to 
commit and which have no aborting ancestor will be taken into account. An NT may 
complete, while preserving consistency, although some of its sub-transactions 
have aborted, which is called in this context “partial abortion”. The behaviour 
of nested transactions with respect to the well-known ACID properties can be 
summed up as follows: all the transactions in an NT (including the root) must 
follow this new definition of atomicity as well as the classic property of isolation. 
However, only the root of the NT has to preserve consistency and to be durable. 

Since many transactions can be executed concurrently, the transaction 
management system must ensure that operations of one transaction do not interfere 
with other transactions. The classic two-phase locking mechanism has been 
adapted to nested transactions in [34] and proved correct in [31]. It states the following: 
when a sub-transaction commits, its locks are not released but are inherited by 
its parent. A transaction can thus have a lock either by request or by inheritance. 
A transaction can only use an object if it requests and obtains a lock on this 
object. Rules for handling read and write locks are described below: 

1. A transaction may obtain a lock in write mode if all other transactions 
holding the lock (in any mode) are ancestors of the requesting transaction. 

2. A transaction may obtain a lock in read mode if all other transactions holding 
the lock, in write mode, are ancestors of the requesting transaction. 

3. When a transaction aborts, all its locks, both read and write are simply 
discarded. If any other transaction among its ancestors holds the same locks, 
it continues to do so, in the same mode as before the abort. 

4. When a non-root transaction commits, all its locks, both read and write, are 
inherited by its parent. This means the parent holds each of those locks, in 
the same mode as the child held them. 

5. When a transaction cannot obtain a lock, it must wait until it is granted. 

6. Once a transaction obtains a lock, it cannot release it before its termination 
(abort or commit). 

The preceding rules guarantee the serializability of nested transactions [31]. An 
object can be updated by only one transaction at a time. When a 
transaction aborts, only its descendants will be affected by this abortion, since the 
modifications of the objects used by that transaction are known only by the 
sub-transactions belonging to its sub-tree. 
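
The locking rules above can be illustrated by a small sketch. It is not the algorithm of [34] as implemented in any system: the names `Tx` and `LockTable` are hypothetical, waiting (rule 5) is reduced to a refused request, and rule 6 is only respected by convention, since locks are released nowhere except in `abort` and `commit`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(eq=False)
class Tx:
    """A transaction node; `parent` is None for the root."""
    name: str
    parent: Optional["Tx"] = None

    def ancestors(self) -> Set["Tx"]:
        """Proper ancestors of this transaction."""
        out: Set["Tx"] = set()
        node = self.parent
        while node is not None:
            out.add(node)
            node = node.parent
        return out


@dataclass
class LockTable:
    """Maps an object name to its lock holders, each with mode 'r' or 'w'."""
    holders: Dict[str, Dict[Tx, str]] = field(default_factory=dict)

    def can_lock(self, tx: Tx, obj: str, mode: str) -> bool:
        others = {t: m for t, m in self.holders.get(obj, {}).items() if t is not tx}
        if mode == "w":   # rule 1: every other holder must be an ancestor
            blocking = others
        else:             # rule 2: only holders in write mode matter
            blocking = {t: m for t, m in others.items() if m == "w"}
        return all(t in tx.ancestors() for t in blocking)

    def lock(self, tx: Tx, obj: str, mode: str) -> bool:
        """Grant the lock if allowed; False means the requester must wait (rule 5)."""
        if not self.can_lock(tx, obj, mode):
            return False
        held = self.holders.setdefault(obj, {})
        held[tx] = "w" if "w" in (mode, held.get(tx)) else "r"
        return True

    def abort(self, tx: Tx) -> None:
        """Rule 3: the locks of an aborting transaction are simply discarded."""
        for held in self.holders.values():
            held.pop(tx, None)

    def commit(self, tx: Tx) -> None:
        """Rule 4: a committing non-root transaction passes its locks to its parent."""
        for held in self.holders.values():
            mode = held.pop(tx, None)
            if mode is not None and tx.parent is not None:
                held[tx.parent] = "w" if "w" in (mode, held.get(tx.parent)) else "r"
```

For example, if sub-transaction T1 holds a write lock on an object, a sibling T2 cannot read that object until T1 commits and the lock is inherited by their common parent, which is an ancestor of T2.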

3 Integrity Constraints 

Databases are supposed to be consistent. Consistency is generally assured by 
integrity constraints (IC), which are logical assertions that must always hold in 
the database. A database state is consistent if and only if all constraints are 
satisfied. Much work concerning the checking of integrity constraints has already 
been done [36,8,23,7,22]. Good surveys describing the various approaches are 
given in [21,17]. Some approaches are adapted for on-line transactions, such as 
active rules and triggers [45], where the user is in charge of determining the events 
that trigger the checking of a given constraint. Other solutions use compilation 
techniques to reduce the constraint checking process at execution time [5,25,41,27]. 
Our work is related to this second approach. We choose to use Themis to 
illustrate our proposal. However, it can be adapted to any constraint management 
mechanism using the same principles. 

Themis [3,37,13,33] is an object database programming language that 
supports the specification of integrity constraints. In this language, IC are defined in 
a global and declarative way. A syntactic analysis of both constraints and 
transactions automatically determines a set of constraints which could be 
violated by a transaction, and checking code is automatically generated. We 
present in this section the main features of Themis. For more details about Themis 
the reader may refer to [3]. 

3.1 Basic Concepts of Themis 

Themis is a strongly and statically typed object-oriented database language. 
A schema in Themis is defined in a classic way using concrete and abstract 
types, classes and integrity constraints. Concrete types are recursively built using 
atomic types (integer, string and boolean) and constructors (tuple, set and list). 
Abstract types are composed of a structural part, which is similar to concrete 
types, and a behavioural part. Instances of concrete types are non-shared values, 
while instances of abstract types are objects, having an identity independent of 
their values. Objects may be shared. Classes in Themis are the persistent roots 
of the database. 

Integrity constraints are boolean expressions built using the classes of the 
schema, and general operators. First, terms are defined as follows: 

— Constants are terms. 

— Each variable $x$ is a term. 

— Let $t$ be a term and let $a$ be an attribute; $t.a$ is a term ($t$ is a tuple-structured 
attribute, and $a$ is an attribute). 

— Let $t$ be a term, let $x_1, \ldots, x_n$ be variables, and let $m$ be a method; 
$t.m(x_1, \ldots, x_n)$ is a term. 

— Let $t_1$ and $t_2$ be two terms and let $\vartheta$ be an arithmetic operator 
($+$, $-$, $*$, $\div$); $t_1 \,\vartheta\, t_2$ is a term. 

An integrity constraint $C$ is an expression of the form: 

$$C = Q x_1 \in S_1, \ldots, Q x_k \in S_k \; M(x_1, \ldots, x_k)$$

where $Q \in \{\forall, \exists\}$, $S_i$ is a set-valued expression (e.g. a class name), and 
$M(x_1, \ldots, x_k)$ is a quantifier-free formula. 

Formulas are defined as follows: 

— Let $\theta$ be a comparison predicate ($=$, $\neq$, $<$, $>$, $\leq$, $\geq$) and let $x$ and $y$ 
be two terms; then $x \,\theta\, y$ is an atomic formula. 

— Each atomic formula is a formula. 

— Let $M$ and $M'$ be two formulas; $M \land M'$, $M \lor M'$, $\neg M$ and $(M)$ are formulas. 

— Nothing else is a formula. 

Updates are performed through transactions. In [3], only flat transactions are 
considered. A flat transaction $FT$ has the following syntactic form: 

$$FT = (t_1, \ldots, t_n)\,\{\Gamma\}$$

where the $t_i$ are parameters and $\Gamma$ represents the body of the transaction, built 
using elementary statements: assignment, method call, composition, conditional 
test, iteration loop, insertion of an element in a set and deletion of an element 
from a set. 



3.2 Nested Transactions in Themis 

In order to consider nested transactions, we extend the syntax of Themis with 
the following instructions: 

— ’{’ and ’}’ are used as sub-transaction delimiters. 

— ’||’ and ’;’ indicate the execution mode (EM) of sub-transactions 
(respectively concurrent and sequential execution). 

— ’op’ and ’ob’ (denoted O) indicate whether a sub-transaction is optional 
(op) or obligatory (ob). A sub-transaction is obligatory if its abort implies 
the abort of its parent; otherwise it is optional.¹ 

— Sub-transactions ($ST_i$) are defined in the same way as transactions. 

A nested transaction $NT$ is syntactically defined in the following way: 

$$\begin{aligned}
NT &= FT \mid (t_1, \ldots, t_n)\; EM\; \{O\,ST_1 \cdots O\,ST_n\} \\
ST_i &= NT \\
FT &= (t_1, \ldots, t_n)\,\{\Gamma\}
\end{aligned}$$

In this way, Themis makes it possible to define nested transactions where only leaf 
transactions manipulate the data. This does not affect the expressive power of the 
transaction model, since every general nested transaction can be transformed 
into an equivalent nested transaction where only leaf transactions can update 
objects, i.e., as shown in [34], the two models are equivalent. These leaf 
transactions are structured in the same way as flat transactions in Themis. 
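
As an illustration of this syntax, the tree of a nested transaction can be mirrored by a small recursive data structure. The sketch below is only an assumed Python rendering, not part of Themis: the class `NT`, its field names, and the function `well_formed` are hypothetical, and bodies are kept as plain strings. The sample value transcribes transaction T5 of the Restaurant example given in Section 3.4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NT:
    """One node of a nested transaction (hypothetical mirror of the grammar)."""
    name: str
    obligatory: bool = True     # O: 'ob' (True) or 'op' (False)
    mode: str = ";"             # EM: ';' sequential or '||' concurrent
    children: List["NT"] = field(default_factory=list)
    body: str = ""              # transaction body, only used by leaves

    def is_leaf(self) -> bool:
        return not self.children

    def leaves(self) -> List["NT"]:
        """Leaf transactions of the sub-tree, the only ones that update data."""
        if self.is_leaf():
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def well_formed(node: NT) -> bool:
    """Inner nodes carry no body and use one of the two execution modes."""
    if node.is_leaf():
        return True
    return (node.body == "" and node.mode in (";", "||")
            and all(well_formed(child) for child in node.children))


# Transaction T5 of the Restaurant example (Section 3.4).
t5 = NT("T5", mode=";", children=[
    NT("T51", obligatory=True, body="o = new(Order)"),
    NT("T52", obligatory=True, mode="||", children=[
        NT("T521", obligatory=False, body="o -> orderdrink()"),
        NT("T522", obligatory=False, body="o -> orderfirstcourse()"),
        NT("T523", obligatory=True, body="o -> ordermaincourse()"),
    ]),
    NT("T53", obligatory=False, body="o -> orderdessert()"),
])
```

Keeping the bodies only at the leaves reflects the restriction stated above: inner nodes merely organise the execution of their children.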



¹ Obligatory and optional sub-transactions are respectively called vital and not-vital 
in [9,10,40], and critical and not-critical in [6]. 

3.3 Principles of Checking Integrity Constraints in Themis 

The checking process of integrity constraints in Themis consists of two steps: 

1. reduce the number of constraints to be checked by a transaction at compile 
time, and 

2. automatically generate an efficient run-time checker. 

For the first step, a syntactic analysis of both constraints and transactions is 
used to determine which constraints might be violated by a given transaction. 
Intuitively, a transaction T might violate a constraint C if they manipulate the 
same data structures. The basic principle of this purely syntactic analysis consists 
of detecting, for each constraint and each transaction, the set of structures it 
involves. The result of the analysis is a set of paths in the database, which 
gathers the various structures. When the intersection of the analysis of T and 
the analysis of C is not empty, T might violate C. For the sake of simplicity, 
we say that transaction T touches constraint C. The syntactic analysis process 
for flat transactions and constraints that do not include methods is described 
in [3]. In [37] the syntactic analysis is extended to constraints that can include 
methods. In the nested transaction model, we consider that only leaf transactions 
can modify the database and that these are flat transactions. Thus, it is possible 
to apply the same syntactic analysis defined in [37] to each leaf transaction and 
to determine, at compile time, the set of constraints that it touches. Consequently, 
the set of constraints touched by a parent transaction is the union of the sets of 
constraints touched by its sub-transactions. 
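
The principle of this first step can be sketched as a set intersection. The path sets below are invented for illustration, loosely following the Restaurant example of Section 3.4; they are not the output of the actual Themis analysis, and the function names are hypothetical.

```python
from typing import Dict, Set

# Assumed results of the syntactic analysis: database paths involved in
# each leaf transaction and in each constraint (illustrative values only).
leaf_paths: Dict[str, Set[str]] = {
    "T51": {"Orders"},
    "T521": {"Orders.description", "Drinks.quantity"},
    "T522": {"Orders.description", "First_Courses.quantity"},
    "T523": {"Orders.description", "Main_Courses.quantity"},
    "T53": {"Orders.description", "Desserts.quantity"},
}
constraint_paths: Dict[str, Set[str]] = {
    "C10": {"Items.name"},
    "C20": {"First_Courses.quantity"},
    "C30": {"Main_Courses.quantity"},
    "C40": {"Orders", "Orders.description"},
}


def touched_by_leaf(leaf: str) -> Set[str]:
    """Constraints whose paths intersect the paths of the leaf transaction."""
    return {c for c, paths in constraint_paths.items() if paths & leaf_paths[leaf]}


def touched_by_parent(leaves: Set[str]) -> Set[str]:
    """A parent touches the union of what its sub-transactions touch."""
    touched: Set[str] = set()
    for leaf in leaves:
        touched |= touched_by_leaf(leaf)
    return touched
```

With these assumed paths, a parent with leaves T521, T522 and T523 touches C20, C30 and C40, while no leaf touches C10.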

Given a transaction and a constraint touched by this transaction, the second 
step consists of generating a checking algorithm. This algorithm will operate 
on the smallest set of objects involved in the checking process. This set is 
determined at run-time. It is composed of the objects having attributes that have 
been modified by the transaction and that are relevant to the constraint. [3] details how 
to automatically generate optimised algorithms for constraint checking. Those 
algorithms are generated, at the end of flat transactions, for a sub-class of 
formulae: universally quantified formulae. In this work, we keep the same principle 
of generation of checking algorithms as proposed in [3]. 

3.4 Example 

In this section, we give an example to illustrate the concepts of Themis presented 
above. Figure 1 shows the schema of a Restaurant database. ’[...]’ denotes a 
tuple, ’{. . .}’ denotes a set. 

This schema has six persistent roots (classes). They are sets of abstract type 
instances. The abstract type Item represents the dishes offered in the restaurant. 
Dishes are classified in four categories: first courses, main courses, desserts and 
drinks. They are respectively modeled by classes First_Courses, Main_Courses, 
Desserts and Drinks. Dishes are characterised by a name, a price, and the quantity 
available in the restaurant. This quantity is automatically updated each time a 
dish is ordered. The abstract type Order models a client order. Its attributes 
are self-explanatory. The methods orderfirstcourse(), ordermaincourse(), 
orderdessert(), and orderdrink() are used to perform an order, one per category of dish: 
an item is chosen by the client in the corresponding category. The item is added 
to the order description list and its available quantity is decreased by one. The 
last method, nb_orders(), gives the number of dishes ordered by a client. 

    type Item is abstract [
        name: string,
        price: real,
        quantity: integer ]
    end;

    type Order is abstract [
        num_client: integer,
        date: string,
        description: list (Item) ]
    method
        orderfirstcourse();
        ordermaincourse();
        orderdessert();
        orderdrink();
        nb_orders(): integer;
    end;

    class Orders : {Order};
    class First_Courses : {Item};
    class Main_Courses : {Item};
    class Desserts : {Item};
    class Drinks : {Item};
    class Items : First_Courses ∪ Main_Courses ∪ Desserts ∪ Drinks;

Fig. 1. Schema of the Restaurant Database 


For this schema, we define four integrity constraints, presented in Figure 2. 
Constraint C10 expresses that each item must have a unique name. Constraints 
C20 and C30 are needed for the management of the item stocks. They specify 
that the quantity of an item must be positive (otherwise it cannot be served!). We 
only give here two of these constraints (for First_Courses and Main_Courses), but 
obviously the same kind of constraint exists for the other categories of dishes. 
Finally, constraint C40 states that an order must contain at least two items. 
Figure 3 gives an example of a nested transaction representing the order of a 
client. Transaction T5 is a sequence (specified by ’;’) of three sub-transactions, 
T51, T52, and T53. T52 is composed of three sub-transactions (T521, T522, and 
T523) executed concurrently (specified by ’||’). T51 is obligatory (specified by 
’ob’), as well as T52 and T523. The other sub-transactions are optional. 



    (C10)  $\forall e_1, e_2 \in Items,\ e_1 = e_2 \lor e_1.name \neq e_2.name$
    (C20)  $\forall f \in First\_Courses,\ f.quantity > 0$
    (C30)  $\forall m \in Main\_Courses,\ m.quantity > 0$
    (C40)  $\forall o \in Orders,\ o.nb\_orders() \geq 2$

Fig. 2. Integrity Constraints 


    T5 () ; {
        ob () { o = new(Order) }                     /* T51 */
        ob () || {                                   /* T52 */
            op () { o → orderdrink() }               /* T521 */
            op () { o → orderfirstcourse() }         /* T522 */
            ob () { o → ordermaincourse() } }        /* T523 */
        op () { o → orderdessert() }                 /* T53 */
    }

Fig. 3. Nested Transaction in Themis 



Fig. 4. Nested Transaction Tree 



Figure 4 represents nested transaction T5 as a tree, and the constraints touched 
by the leaves (sub-transactions which perform updates). C40 is a general 
constraint, concerning every order, and thus each sub-transaction. C20 only concerns 
First_Courses; it must be satisfied after T522. The same holds for C30 and T523. 
C10 is not concerned by this transaction, which does not modify the name of 
Items. Constraint C40 must obviously be verified at the end of the root 
transaction. This is not the case for the other constraints, which can be verified 
earlier, at a lower level in the tree. Indeed, the set of objects which should be 
used for the checking of a constraint can be obtained immediately after the last 
sub-transaction that touches it has finished. In the absence of failures, the state of 
the database with regard to this constraint will not change any more. Our goal is 
to determine the level at which a constraint must be verified. Our solution is 
presented in the following. 



4 Checking Integrity Constraints in Nested Transactions 

Checking integrity constraints may be very costly when long-duration 
transactions are performed using a conventional flat transaction model. Nested 
transactions have proved to be much better suited to the problems 
of long-duration transactions. As systems supporting nested transactions have 
been implemented, including execution control, concurrency control, and 
recovery control, they may be used to develop a mechanism that checks 
integrity constraints in a way integrated with the execution of nested transactions. The 
main idea of our solution is to choose for each constraint a sub-transaction 
which will be responsible for its checking: the smallest common ancestor of the 
sub-transactions touching it. The communication between the leaf transactions 
touching the constraint and the sub-transaction responsible for its checking 
makes it possible to launch the checking of a constraint as soon as all the sub-transactions 
touching it have terminated and, in case of violation, to abort as few operations 
as possible. We also take into account the possible re-checking of a constraint 
if a sub-transaction touching it aborts because of another constraint. We first 
describe our assumptions about the execution environment we address. Then we 
develop the different steps of our solution. 

4.1 Description of the Environment 

We sum up here the features of the nested transaction system and of the integrity 
constraint manager which we take into account: 

— Only leaf transactions can update objects, being thus the only ones likely 
to violate the consistency of the database (cf. Section 3.2). This restriction allows a 
simplified definition of the consistency control in nested transactions. 

— At transaction compile time, we assume that we can get information about its 
structure. We particularly need to know which node is the smallest common 
ancestor of a set of leaf transactions. This does not raise any problem when 
nested transactions are defined as shown in Section 3.2. 

— Integrity constraints are expressed using the Themis language. When a 
transaction is created and compiled, it is possible to determine, through a 
syntactic analysis, which constraints are touched, i.e., likely to be violated by the 
transaction. Applying this analysis to the components of a nested 
transaction, we determine: 

— For each leaf transaction Tlj of the nested transaction T, the set C(Tlj) 
of all the constraints touched by Tlj. 

— In the same way, the set T(Ci) of all the leaf 
transactions touching a given constraint Ci. We denote by SCA(Ci) the smallest 
common ancestor of all the leaf transactions included in T(Ci). SCA(Ci) 
is responsible for the maintenance of constraint Ci. 
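
A minimal sketch of how SCA(Ci) could be computed from the transaction tree is given here. The tree follows the example of Figure 5, but the intermediate nodes (T1, T11, T21, T22) are an assumption inferred from the leaf names, since only the leaves, T and T2 are named in the text; the functions `path_to_root` and `sca` are hypothetical helpers.

```python
from typing import Dict, List, Optional

# Parent of each node; intermediate nodes are assumed, not taken from the paper.
parent: Dict[str, Optional[str]] = {
    "T": None, "T1": "T", "T2": "T",
    "T11": "T1", "T21": "T2", "T22": "T2",
    "Tl111": "T11", "Tl112": "T11",
    "Tl211": "T21", "Tl212": "T21", "Tl222": "T22",
}


def path_to_root(node: str) -> List[str]:
    """The node itself followed by its ancestors up to the root."""
    path = [node]
    up = parent[node]
    while up is not None:
        path.append(up)
        up = parent[up]
    return path


def sca(leaves: List[str]) -> str:
    """Smallest common ancestor: the deepest node lying on every leaf's path."""
    common = set(path_to_root(leaves[0]))
    for leaf in leaves[1:]:
        common &= set(path_to_root(leaf))
    # Climbing from any leaf, the first common node met is the deepest one.
    return next(n for n in path_to_root(leaves[0]) if n in common)


# T(Ci) for the three constraints of the example.
touching = {"C1": ["Tl212", "Tl222"], "C2": ["Tl112", "Tl211"], "C3": ["Tl111"]}
responsible = {c: sca(leaves) for c, leaves in touching.items()}
```

Under this assumed tree, T2 is responsible for C1 and the root T for C2. A constraint touched by a single leaf, such as C3, is assigned to that leaf itself, because a node is counted here as lying on its own path.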

In the remainder of this article, we use the example shown in Figure 5. The 
whole tree of the nested transaction T is represented, each transaction being 
represented by a node in the tree. Below each leaf transaction Tlj, the set of 
constraints touched by Tlj, i.e. C(Tlj), is shown. For instance, C(Tl211) = {C2}. 
We can see that T(C1) = {Tl212, Tl222}, i.e. constraint C1 is touched by leaves 
Tl212 and Tl222. T2 is the smallest common ancestor of Tl212 and Tl222; T2 is 
thus the node responsible for the checking of C1. The checking procedure of 
each constraint Ci is represented by a rectangle and associated with the node 
sub-transaction responsible for performing it. For instance, T2 is associated with the 
checking procedure of C1. 



Fig. 5. Example of Nested Transaction 



4.2 The Proposed Solution 

Taking into account the behaviour rules of a nested transaction described in 
Section 2, together with the principles of Themis described in Section 3, it is 
possible to integrate a consistency checking mechanism into the nested 
transaction execution control. Our main goal is twofold: 

1. to check a constraint as soon as possible, i.e. as soon as all the operations 
touching it are performed, and in case of a violation, to abort as few 
sub-transactions as possible, and 

2. to interfere as little as possible with the underlying execution control and 
concurrency control mechanisms. 

Our solution ensures that if the root transaction commits, then the database 
is in a consistent state. The key point is to designate, for each constraint Ci, 
a node sub-transaction to be responsible for its checking. For this purpose, we 
determine, at transaction compile time, the smallest common ancestor of all the 
leaf transactions touching the constraint Ci, i.e. SCA(Ci) as defined above. As 
this node controls the execution of its sub-tree, it is possible to ensure, in case 
of violation of Ci, that not only the sub-transactions which violated Ci, but 
also the ones which used their results, will be aborted. On the other hand, as 
opposed to conventional integrity mechanisms for flat transactions, we let other 
sub-transactions not concerned with Ci continue their execution. 

Transaction SCA(Ci), responsible for checking constraint Ci, knows the 
identification of all its leaf transactions touching Ci. Each of those leaf 
transactions knows the identification of SCA(Ci). When all of them have 
terminated, SCA(Ci) can launch the checking procedure Check(Ci). If Ci is satisfied, 
SCA(Ci) has no action to perform. Otherwise, if Ci is violated, then SCA(Ci) 
performs a set of actions in order to restore consistency. 

In the remainder of the paper, we present how we detect a constraint 
violation and which actions are performed to restore consistency of the database. 
For clarity, we distinguish two cases. In the first case, which we call 
“all mandatory”, all the children of a sub-transaction have to commit to 
enable the sub-transaction to commit. Thus the root of the nested transaction can 
only commit when all nodes of the tree representing the nested transaction have 
decided to commit. In the second, general case, called “with options”, a node 
transaction can commit even if some of its child transactions have decided to 
abort (according to predefined user decisions). We first treat the case “all 
mandatory”, since it is much simpler, and then extend the solution to cope with the 
general case “with options”. 



The Simple Case “All Mandatory”. If all the children of a transaction 
have to commit to enable their parent to commit, it is clear that, as soon as a 
constraint violation is detected, the whole nested transaction (including its root 
node) has to abort. As a consequence, in this simple case, our problem can be 
expressed as follows: 

1. to detect as soon as possible if a constraint is violated and 

2. to abort the whole transaction as fast as possible in case of any constraint 
violation. 

This solution can be implemented in a very straightforward way. When a leaf 
transaction has terminated its actions, it sends (for each constraint Ci it 
touches) an additional message to notify SCA(Ci) of its decision (commit or abort). 
When SCA(Ci) has received decision messages from all the leaf transactions 
touching Ci and at least one of those decision messages is “commit”, it launches 
the checking procedure of Ci. This clearly solves issue 1. In case of violation 
of Ci, SCA(Ci) sends an abort message to the root of the nested transaction, 
which will in turn send abort messages to all its children and recursively to 
the whole nested transaction, solving issue 2. This recursive way of aborting the 
nested transaction is already implemented in the execution and recovery control 
mechanism of nested transaction systems. Of course, as in this case the nested 
transaction model does not allow optional sub-transactions, the only solution to 
restore consistency is to abort the whole transaction, as in the case of 
conventional flat transactions. But the advantage of the solution is that any long-duration 
transaction written in a flat transaction model can be translated in a rather 
simple way to a nested transaction with all sub-transactions mandatory. Thus our 
solution allows programmers, with little effort, to detect inconsistencies 
and abort the transaction as soon as possible, which is already a substantial gain 
in the case of long-duration transactions. 
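
The message handling performed by SCA(Ci) in this simple case can be sketched as follows. This is only an illustration under stated assumptions: the class `Checker` and its return values are hypothetical, message passing is reduced to a method call, and `check` stands in for the generated procedure Check(Ci).

```python
from typing import Callable, List


class Checker:
    """State kept by SCA(Ci) for one constraint Ci ("all mandatory" case)."""

    def __init__(self, leaves: List[str], check: Callable[[], bool]) -> None:
        self.pending = set(leaves)   # leaves touching Ci that have not reported
        self.any_commit = False      # at least one "commit" decision received
        self.check = check           # hypothetical stand-in for Check(Ci)

    def on_decision(self, leaf: str, decision: str) -> str:
        """Handle one decision message; return 'wait', 'ok' or 'abort-root'."""
        self.pending.discard(leaf)
        self.any_commit = self.any_commit or decision == "commit"
        if self.pending:
            return "wait"            # some leaves touching Ci are still running
        if self.any_commit and not self.check():
            return "abort-root"      # the root then aborts the whole tree
        return "ok"
```

For instance, a `Checker` created for leaves Tl212 and Tl222 answers 'wait' to the first decision message and launches the check only when the second one arrives.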

Of course, in the case of a nested transaction having optional sub-transactions, 
aborting the whole transaction when one constraint is violated is not satisfactory. 
Indeed, it may be possible to abort only optional sub-transactions and 
then to commit as many sub-transactions as possible. The new definition of 
atomicity induced by the possibility of optional sub-transactions allows a refined 
treatment of consistency management, explained in the following subsection. 



The General Case with Optional Sub-transactions. In the general case, 
we use the feature of “partial abortion” included in the nested transaction model 
to refine our solution and obtain a consistent database state without having to 
abort the whole transaction. As explained in Section 2, each sub-transaction 
can commit even if some of its children declared as optional decide to abort. As 
there is a possibility to maintain consistency in the database without aborting the 
whole transaction, the problem is more complicated. While issue (1) remains 
the same, issue (2) is turned into the more complex (2’), which consists of 
“aborting as few sub-transactions as possible from the nested transaction”. 

In this context, each time a constraint Ci is violated, we must abort the leaf 
transactions that touch Ci, and recursively the sub-transactions which have already 
used their results. 

All the sub-transactions to be aborted because of the violation of constraint 
Ci are inside the sub-tree of SCA(Ci). During the abort process inside this 
sub-tree, the sub-transactions not belonging to this sub-tree may continue their 
execution until reaching the eventual commit of the nested transaction. Of course, 
the abort of some sub-transaction within the sub-tree may cause the abort of 
SCA(Ci) itself, but this decision of SCA(Ci) will be handled by the nested 
transaction execution control. 

To abort as few sub-transactions as possible in case of the violation of Ci, we 
proceed as follows: SCA(Ci) sends a special abort message through the branches 
of its sub-tree which contain leaf transactions involved in the violation of Ci. 
The message is propagated top-down until reaching the smallest ancestor still 
active (i.e. not having yet decided) of each of those involved leaf transactions. 
This smallest ancestor will take the decision to abort, consequently ordering its 
whole sub-tree to abort. 
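
The effect of this propagation can be sketched as below. The code does not simulate the messages themselves: it climbs from each involved leaf instead, which yields the same node as the one where the top-down message stops. The tree is the assumed shape of Figure 5 (intermediate nodes are hypothetical), the function names are invented for illustration, and SCA(Ci) itself is assumed to be still active when the check fails.

```python
from typing import Dict, List, Optional, Set

# Assumed shape of the tree of Figure 5 (intermediate nodes are hypothetical).
parent: Dict[str, Optional[str]] = {
    "T": None, "T1": "T", "T2": "T",
    "T11": "T1", "T21": "T2", "T22": "T2",
    "Tl111": "T11", "Tl112": "T11",
    "Tl211": "T21", "Tl212": "T21", "Tl222": "T22",
}


def abort_targets(sca: str, involved: List[str], decided: Set[str]) -> Set[str]:
    """Deepest still-active node between each involved leaf and SCA(Ci)."""
    targets: Set[str] = set()
    for leaf in involved:
        node = leaf
        while node != sca and node in decided:
            up = parent[node]
            if up is None:       # reached the root; cannot climb further
                break
            node = up
        targets.add(node)
    return targets


def subtree(node: str) -> Set[str]:
    """The node together with all of its descendants."""
    result = {node}
    changed = True
    while changed:
        changed = False
        for child, up in parent.items():
            if up in result and child not in result:
                result.add(child)
                changed = True
    return result


def aborted_by_violation(sca: str, involved: List[str], decided: Set[str]) -> Set[str]:
    """All sub-transactions aborted when the constraint checked by `sca` is violated."""
    aborted: Set[str] = set()
    for target in abort_targets(sca, involved, decided):
        aborted |= subtree(target)
    return aborted
```

If C1 is violated while only the leaves Tl212 and Tl222 have decided, the abort stops at T21 and T22, so T2 may still commit if those children are optional. If T21 and T22 have already decided as well, T2 itself must abort.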

In the remaining of this subsection, we describe the sub-transactions beha- 
viour according to the above mentioned principles. We use the example shown 
in Figure 5 to represent the different cases. 

As mentioned in subsection 4.2, our solution is based on the fact that we
can determine the structural information about nested transaction T at compile
time. This means that we can get the complete structure G(T) of the tree formed
by T: the identification of all the sub-transactions and the parent-children
relationships. We can also determine C(Tlj), the set of constraints that a leaf
transaction Tlj could violate. This information allows us to build a table CCT
(Constraints Checking Table), with one entry CCT(Ci) per constraint Ci, as
follows:

CCT(Ci) = (Ci, SCA(Ci), list(Tlj, Termination(Tlj)))

where

— Ci is the constraint identifier,

— SCA(Ci) is the node responsible for checking Ci, and

— list(Tlj, Termination(Tlj)) is the list of all the leaf transactions Tlj that
could violate Ci, together with the status of Tlj:

    — 'S' if Tlj is still in progress,

    — 'A' if Tlj has aborted, and

    — 'V' if Tlj decided to commit.

For instance, the CCT table corresponding to the example of Figure 5 would be
initialised as follows:



Table 1. Constraints Checking Table

 i    SCA(Ci)    list(Tlj, Termination(Tlj))
 1    T2         ((Tl212, 'S'), (Tl222, 'S'))
 2    T          ((Tl112, 'S'), (Tl211, 'S'))
 3    Tl111      ((Tl111, 'S'))
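As an illustration, the table can be held in a small in-memory structure. The following Python sketch is only one possible encoding and not the authors' implementation: the names `CCTEntry` and `build_cct` and the dictionaries `sca` and `touches` are hypothetical, and the leaf identifiers are those of Table 1 as read from the running example.

```python
from dataclasses import dataclass, field


@dataclass
class CCTEntry:
    """One entry CCT(Ci): constraint id, responsible node, leaf statuses."""
    constraint: str
    sca: str                                      # node responsible for checking Ci
    status: dict = field(default_factory=dict)    # leaf id -> 'S', 'A' or 'V'


def build_cct(sca, touches):
    """Build the table from compile-time information.

    sca:     constraint id -> identifier of SCA(Ci)
    touches: leaf id -> set of constraints the leaf could violate, i.e. C(Tlj)
    """
    cct = {c: CCTEntry(c, node) for c, node in sca.items()}
    for leaf, constraints in touches.items():
        for c in constraints:
            cct[c].status[leaf] = "S"   # every leaf starts as "still in progress"
    return cct


# Data of the running example (Figure 5 / Table 1).
sca = {"C1": "T2", "C2": "T", "C3": "Tl111"}
touches = {
    "Tl111": {"C3"},
    "Tl112": {"C2"},
    "Tl211": {"C2"},
    "Tl212": {"C1"},
    "Tl222": {"C1"},
}
cct = build_cct(sca, touches)
```

Keeping the statuses in a dictionary keyed by leaf makes the later update "switch Termination(Tlj) to 'A' or 'V'" a single assignment.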



The following four points describe the behaviour of the sub-transactions:

1. When a leaf transaction Tlj terminates its execution, it sends its decision
(commit or abort) and its own identifier to each node sub-transaction
responsible for the checking of a constraint it touches:

Tlj: on terminating execution,
     sends the termination message (Tlj, 'A' or 'V')
     to each SCA(Ci) such that Ci ∈ C(Tlj).

In the example of Figure 5,

— Tl111 will send its decision message to Tl111, for the checking of C3,

— Tl112 and Tl211 will send their decision message to T, for the checking
of C2, and

— Tl212 and Tl222 will send their decision message to T2, for the checking
of C1.

2. When a sub-transaction SCA(Ci) receives a decision message from a leaf
transaction Tlj touching Ci, it updates the Ci entry of the CCT table with
the decision sent by Tlj, 'A' or 'V':

SCA(Ci): on receiving message (Tlj, 'A' or 'V'),
         switches Termination(Tlj) to 'A' or 'V' in CCT(Ci)

For instance, if T receives a decision message from Tl112 willing to commit,
it will switch the pair for Tl112 from (Tl112, 'S') to (Tl112, 'V') in the
entry corresponding to C2.

3. When SCA(Ci) has received decision messages from all the leaf transactions
touching Ci, i.e. when no (Tlj, 'S') remains in the Ci entry of CCT,
it launches Check(Ci). If the constraint is satisfied, no other action is
performed by SCA(Ci). If the constraint is violated, then for each Tlj in
CCT(Ci), SCA(Ci) sends a message (R*, Ci) to its child which is an ancestor of
Tlj (which can be determined by looking at G(T)). This message means “roll
back Tlj and everything that used the results of Tlj”. Of course, if one
child of SCA(Ci) has already decided to commit, or if SCA(Ci) is the parent
of Tlj, or if SCA(Ci) is Tlj itself, then SCA(Ci) itself must abort. If not,
the message is propagated. We can sum up the behaviour of SCA(Ci) by the
following algorithm:








If (∀ (Tlj, Termination(Tlj)) ∈ CCT(Ci)) Termination(Tlj) ≠ 'S' then
    If ¬Check(Ci) then perform Propagate(SCA(Ci), Ci)
    End if
End if

where Propagate(trans, constr) is the procedure which performs the
propagation of the message R*.

Propagate(trans, constr):
    Set := {Tk | Tk child of trans and ancestor of Tlj,
                 Tlj ∈ T(constr)}
    If (∀ Tk ∈ Set, Tk is active) and
       ¬(trans parent of Tlj, Tlj ∈ T(constr)) then
        For each Tk ∈ Set, Send(R*, constr) to Tk
    Else
        Replace each (Tlj, 'V') ∈ CCT with (Tlj, 'A')
        Abort sub-tree of trans
        Abort trans
    End if

If, in our example, Tl112 and Tl211 both decide to commit and send the
corresponding message to T, then the entry corresponding to C2 contains the
list ((Tl112, 'V'), (Tl211, 'V')) and T checks C2. We suppose that C2 is
satisfied, thus no other action is performed. Assume that both Tl212 and Tl222
decide to commit, triggering the checking of C1 by T2, and that C1 is violated.
If both T21 and T22 are still active, then T2 sends (R*, C1) to its children T21
and T22. Otherwise, T2 decides to abort.

4. In order to abort each leaf transaction Tlj which touched Ci, the message
(R*, Ci) is propagated from generation to generation, until it reaches the
smallest active ancestor of Tlj, which we denote SAA(Tlj). It is easy to prove
that the abort of SAA(Tlj), which causes the abort of its sub-tree, achieves
our goal. Indeed, we abort Tlj and all the transactions that, directly or
indirectly, may have used the results of Tlj. Of course, SAA(Tlj) cannot be
statically determined, since it depends on the progress of the nested
transaction execution. To determine it and perform the appropriate rollback,
each sub-transaction Tk has the following behaviour:

Tk: on receiving message (R*, Ci),
    performs Propagate(Tk, Ci)
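The four points can be simulated with a short, single-threaded Python sketch. It is an illustration under stated assumptions rather than the authors' mechanism: the tree layout in `PARENT` is an assumed shape for Figure 5 (the figure is not reproduced in the text), the class `Scheduler` and its method names are hypothetical, message passing is replaced by direct calls, and the case "SCA(Ci) is Tlj" follows the prose of point 3.

```python
# Assumed shape of the tree of Figure 5 (child -> parent). The figure is not
# reproduced in the text, so this layout is an assumption consistent with it.
PARENT = {
    "T1": "T", "T2": "T",
    "T11": "T1", "T21": "T2", "T22": "T2",
    "Tl111": "T11", "Tl112": "T11",
    "Tl211": "T21", "Tl212": "T21", "Tl222": "T22",
}


def ancestors(node):
    """Proper ancestors of node, nearest first."""
    out = []
    while node in PARENT:
        node = PARENT[node]
        out.append(node)
    return out


def subtree(node):
    """node together with all of its descendants."""
    return {n for n in list(PARENT) + ["T"] if n == node or node in ancestors(n)}


class Scheduler:
    """Hypothetical driver for points 1 to 4 (no real messages or threads)."""

    def __init__(self, cct, sca, check):
        self.cct = cct          # constraint -> {leaf: 'S' | 'A' | 'V'}
        self.sca = sca          # constraint -> node SCA(Ci)
        self.check = check      # constraint -> True if the constraint holds
        self.decided = {}       # node -> 'A' or 'V' once it has decided
        self.log = []           # trace of (R*, ...) messages and aborts

    def leaf_terminates(self, leaf, decision):
        """Points 1 to 3: record the decision, check once the entry is complete."""
        self.decided[leaf] = decision
        for c, entry in self.cct.items():
            if leaf not in entry:
                continue
            entry[leaf] = decision
            if "S" not in entry.values() and not self.check(c):
                self.propagate(self.sca[c], c)

    def propagate(self, trans, c):
        """Points 3 and 4: pass (R*, c) down, or abort at the smallest active ancestor."""
        leaves = list(self.cct[c])
        direct = trans in leaves or any(PARENT[l] == trans for l in leaves)
        children = {a for l in leaves for a in ancestors(l)
                    if PARENT.get(a) == trans}
        if not direct and all(k not in self.decided for k in children):
            for k in sorted(children):
                self.log.append(("R*", c, k))
                self.propagate(k, c)
        else:
            self.abort(trans)

    def abort(self, trans):
        """Abort trans with its whole sub-tree and mark the leaves 'A' in the CCT."""
        doomed = subtree(trans)
        for n in doomed:
            self.decided[n] = "A"
        for entry in self.cct.values():
            for leaf in entry:
                if leaf in doomed:
                    entry[leaf] = "A"
        self.log.append(("abort", trans))


cct = {
    "C1": {"Tl212": "S", "Tl222": "S"},
    "C2": {"Tl112": "S", "Tl211": "S"},
    "C3": {"Tl111": "S"},
}
sca = {"C1": "T2", "C2": "T", "C3": "Tl111"}
run = Scheduler(cct, sca, check=lambda c: c != "C1")   # pretend only C1 fails
for leaf in ["Tl112", "Tl211", "Tl212", "Tl222"]:
    run.leaf_terminates(leaf, "V")
```

In this run T2 forwards (R*, C1) to T21 and T22, which are the parents of the offending leaves and therefore abort with their sub-trees, while T2 itself stays undecided. If T21 had already decided before Tl222 terminated, the `else` branch would abort T2 as a whole.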

With points 1 to 4, we ensure that, whenever a constraint is violated, the leaf
transactions that caused this violation are aborted, together with as few as
possible of the sub-transactions which used their results. Thus we ensure that
the corresponding constraint will be satisfied by the state of the database
after an eventual commit of the nested transaction. However, this cascading
abort may have side effects on the other constraints, forcing us to check them
again.

For instance, suppose that in our example Tl112, Tl211, and Tl212 have
terminated, and that T has checked C2 and detected no violation. When Tl222
then terminates, T2 launches the checking of C1, since both Tl212 and Tl222 have
terminated. If C1 is violated, then the smallest active ancestor of Tl212 may be
either T21 or T2. In both cases, the rollback process includes Tl211. As a side
effect, the checking of C2 performed by T is no longer valid, since it was done
considering the effects of both Tl112 and Tl211. Thus, T has to check C2 again,
with only the effects of Tl112. To warn T that it has to check C2 again, the
sub-transaction responsible for the abort of Tl211, i.e. T21 or T2, sends a
message to T. It is worth noting that this new checking of C2 is performed as
soon as possible. In fact, the first checking, performed while the transaction
was still in progress, induced no delay in the process. Even if C2 is checked
several times, the checking that counts (the last one) is always started as soon
as the final state of the database concerning C2 is produced.

Another important point is that it is always possible to check a constraint
again if the previous checking has been invalidated by the abort of a sub-tree:
a situation where a constraint Cj has to be checked again by an already
committed sub-transaction is impossible. Indeed, assume that a sub-transaction
Tk and its sub-tree abort because of a constraint violation. Each constraint Cj
touched by leaf transactions in this sub-tree has to be checked by SCA(Cj),
which is an ancestor of these leaf transactions, and therefore either an
ancestor of Tk or a descendant of Tk. In the first case, SCA(Cj) has certainly
not committed yet. In the second case, all the sub-transactions touching Cj are
included in the sub-tree of Tk. As all of them will be aborted, Cj cannot be
violated and there is no need to check it again. To take possible re-checking
of constraints into account, we change the Propagate procedure so that whenever
a sub-transaction aborts, it sends a warning message to each sub-transaction
responsible for a constraint that has to be checked again because of the abort
of the sub-tree.
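The case distinction above can be turned into a small helper that decides which warning messages are needed. The Python sketch below is an assumption-laden illustration, not part of the described mechanism: the function name `constraints_to_recheck` and its arguments are hypothetical, and it restates the two cases in terms of leaves, using the fact that the responsible node is the smallest common ancestor (so it lies above the aborted sub-tree exactly when some, but not all, touching leaves were rolled back).

```python
def constraints_to_recheck(aborted_leaves, touching, already_checked):
    """Constraints whose earlier successful check is invalidated by an abort.

    aborted_leaves:  leaves rolled back together with the aborted sub-tree
    touching:        constraint -> set of leaf transactions touching it
    already_checked: constraints that were checked and found satisfied
    """
    recheck = set()
    for c in already_checked:
        leaves = touching[c]
        hit = leaves & aborted_leaves
        # Some, but not all, touching leaves were rolled back: the responsible
        # node lies above the aborted sub-tree and must check again. If all of
        # them are gone, the constraint cannot be violated any more.
        if hit and hit != leaves:
            recheck.add(c)
    return recheck


# Running example: the sub-tree of T21 (leaves Tl211 and Tl212) is aborted
# after C2 had already been checked successfully by T.
touching = {
    "C1": {"Tl212", "Tl222"},
    "C2": {"Tl112", "Tl211"},
    "C3": {"Tl111"},
}
to_warn = constraints_to_recheck({"Tl211", "Tl212"}, touching, {"C2"})
```

Here `to_warn` contains only C2, matching the example in which T has to check C2 again with only the effects of Tl112.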

However, a sub-transaction may also abort for other reasons (e.g. deadlock
detection), which likewise requires constraints to be checked again. This
situation is outside the control of the Propagate procedure. Two solutions make
it possible to take into account aborts due to causes external to the integrity
mechanism. The first solution is to include the sending of warning messages in
the abort process of a leaf transaction. This is clearly the most efficient
solution, but it requires altering the nested transaction execution control,
which contradicts our goal of modifying the underlying transaction system as
little as possible. The second solution is that each SCA(Ci), after having
received the decision of all its children, checks whether Ci has to be checked
again or not. This solution can be implemented in the behaviour of each node
sub-transaction without changing the nested transaction execution control, but
it defers the possible new checking of Ci until the whole sub-tree of SCA(Ci)
has finished.



Efficiency of the Checking Process. In our solution, we do not block the
execution of the nested transaction because of the checking of a constraint. The
checking process of a constraint is launched by the smallest common ancestor
of the sub-transactions touching it and performed in parallel with the nested
transaction. Whenever possible, the checking process is performed on copies
of the objects involved, so that the locks on these objects can be released by
the sub-transaction concerned to allow the nested transaction to continue.
This is always the case for universally quantified constraints, where all the
objects involved in a constraint are collected during the execution of a
sub-transaction touching it. For existentially quantified constraints, the
checking process may need to access other objects in the database, thus it may
be executed concurrently with some sub-transaction of the nested transaction.
The main advantage of our approach is that, at least for universally quantified
constraints, when no constraint is violated, the nested transaction is not
disrupted by the constraint checking. In other words, our approach is
optimistic: when no constraint is violated, the concurrency level of a nested
transaction is the same as it would be if there were no constraints.



5 Conclusion 

This article presents a mechanism to maintain integrity constraints in
databases supporting nested transactions. Constraints and nested transactions
are defined using the general features of Themis enriched with nested
transaction features. By analysing constraints and transactions, we check
constraints as soon as possible and, in case of violation, abort as few
sub-transactions as possible so that, whenever possible, the nested transaction
can still reach its commit. The key point in reaching this goal is to attach
the control and the checking of a constraint to the smallest common ancestor of
all the sub-transactions touching the constraint. Our solution does not require
any significant change in the execution control mechanism of nested
transactions and does not require users to add any additional code to
transactions and/or constraints.

In our approach, constraints are maintained by a mechanism integrated with
the nested transaction execution control. In contrast to the approach of [12],
where the programmer not only has to define when a constraint has to be checked
within the execution of the nested transaction, but also must define the
behaviour to follow whenever a constraint is violated, our approach offers fully
automatic consistency management. Moreover, our solution does not require
significant modifications of the nested transaction manager, which makes it
adaptable to any nested transaction manager. In [28] the checking of integrity
constraints is attached to the top-level transaction, as opposed to our approach
where we attach the control and the checking of a constraint to the smallest
common ancestor of all the sub-transactions touching it.

Work is in progress to develop and improve our mechanism. A first prototype,
which will implement the ideas presented in this paper, is being designed.
It will not only help us to test the efficiency and the relevance of our
approach, but also serve as a support for further theoretical developments.

We are currently investigating several points to improve our approach. The
first point is to adapt our approach to other NT models, such as Open Nested
Transactions [44] and Transaction Closures [40], which generalise most of the
existing NT models. The second point consists of using information from the
concurrency controller to abort fewer sub-transactions. Indeed, in the present
version of the mechanism, we abort a node whenever one of its children
concerned by a violated constraint has already committed (cf. the Propagate
procedure of section 4.2.2). If we can determine that this child has not
influenced other children of the node, then it is sufficient to abort this child
instead of the node to ensure that consistency is preserved. The third point
concerns distributed databases. We must take into account more carefully the
cost of additional messages, the possibility of losing messages, and the way in
which the checking of a constraint can be performed efficiently, depending on
the location of the objects involved in the network.

Abstract. While specifications of queries usually are of a declarative
nature (since the work of Codd in the early seventies), specifications of 
transactions mainly are of an operational and descriptive nature. Espe- 
cially descriptions of complex transactions (such as cascading deletes) 
tend to be very operational. Declarative specifications of transactions 
usually suffer from the so-called frame problem or do not have a clear 
semantics. Often these descriptions turn out to be nondeterministic as 
well. A problematic consequence is that the semantics of transactions 
and of several related notions is often unclear or even ambiguous. For a 
database designer this surely is not a good starting point for building ap- 
plications. Another tendency we recognize is that the current literature 
on transactions is mainly driven by technical solutions offered by rese- 
arch prototypes and commercial systems and not so much by advanced 
specification requirements from a user’s or database designer’s point of 
view. In our opinion, the research questions should (also) include what 
kind of complex transactions (advanced) users would like to specify (and 
not only what e.g. the expressive power of a given technical solution is), 
and how these specifications can be translated to implementations in 
the currently available (advanced) database management systems. And, 
moreover, was it not our purpose (with the introduction of 4GL’s and the 
like) to become declarative instead of operational, concentrating on the 
“what” instead of the “how” ? This paper offers a general framework for 
declarative specifications of transactions, including complex ones.
Transactions on a state space U are considered as functions from U into U. We
also take the influence of static and dynamic constraints on the alleged 
transactions into account. This leads to the notion of the adaptation of 
a transaction. Applications of our theory included in this paper are the 
declarative specification of cascading deletes and the distinction between 
allowable and available transitions. Basic set theory is our main vehicle. 

Keywords: Transactions, semantics, transaction models, transaction design,
database dynamics, declarative specifications of database behavior,
(static and dynamic) integrity constraints, adaptations, (allowable versus
available) transitions, cascading deletes.

1 Introduction

While specifications of queries usually have a declarative form ([3,4,9,1]),
specifications of transactions mainly are of an operational/imperative and
descriptive nature (see [9,1,5]), as still recalled by, e.g., [1]. Especially
descriptions of complex transactions (for instance cascading deletes) tend to be
very operational, using some kind of execution model. Declarative specifications
of transactions usually suffer from the so-called frame problem or do not have a
clear semantics. In the survey [2], Bonner and Kifer discuss various proposals,
including their strong and weak points. Often these descriptions turn out to be
nondeterministic. This holds in particular for the area of active databases
(e.g., [19,12,15]); see for instance [1, Section 22.5] for a discussion. A
problematic consequence is that the semantics of transactions and of several
related notions is often unclear or even ambiguous. This surely is not a good
starting point for building applications.

Another tendency we recognize is that the current literature on transactions is
mainly driven by technical solutions offered by research prototypes and
commercial systems (e.g., [16,20]) and not so much by advanced specification
requirements from a user's or database designer's point of view. In our opinion,
the research questions should (also) include what kind of complex transactions
(advanced) users such as database designers would like to specify (and not only
what e.g. the expressive power of a given technical solution is), and how these
specifications can be translated to implementations in the currently available
(advanced) database management systems. Moreover, with the introduction of
4GLs and the like it was our purpose to become declarative instead of
operational, concentrating on the “what” instead of the “how”. This intention
has to apply to transactions as well!

This paper contributes to the theory of databases by offering a general
framework for declarative specifications of transactions. In our treatment we
take a semantic approach. We also take the influence of static and dynamic
constraints on the alleged transactions into account, by introducing the notion
of the adaptation of a transaction. Moreover, we can also put the distinction
between allowable and available transitions in place. An advanced application of
our theory concerns the declarative specification of cascading deletes. In our
treatment of cascading deletes, the start set of tuples to be deleted is not
restricted to only one table, the “cascading reference graph” may also contain
cycles, and rollback is incorporated (in case of a violation of any integrity
constraint).

The paper is organized as follows. Section 2 introduces a quite general
definition of transactions, on arbitrary state spaces and within arbitrary
transition relations. The adaptation of a transaction (determined by the given
static and dynamic constraints) is also defined in this section. Theorem 1 shows
that this operation indeed has the nice properties we want it to have. Section 3
introduces a formal declarative definition of cascading deletes. Finally, we
draw our conclusions and sketch our plans for further research in this area. The
paper also contains an appendix explaining the basic notions and notations we
used.

2 Transactions and Adaptations



2.1 Transactions 



The state of an organization is usually liable to change: employees, customers, 
and products come and go, orders and invoices are received and handled, and 
salaries, stocks, and prices go up and down in the meantime. The administrative 
repercussions of such changes — in the field of databases referred to as modi- 
fication or manipulation or maintenance — can be specified formally by means 
of functions that assign to each possible (database) state the (database) state 
reflecting the new situation. We call such functions from a state space U into U 
itself transactions on U. The general definition, for arbitrary state spaces, reads 
as follows: 

Definition 1. If U is a set, then:

\[
f \text{ is a transaction on } U \;\Leftrightarrow\; f \in U \to U.
\]

Example 1. We will use a simple employees-and-departments database as an
example to illustrate our points. The database keeps track of the employee
number, department number, salary, and bonus of each employee, and of the
department number, manager number, and budget of each department. Our example
is based on the following database schema $g_1$, which enumerates each table
symbol and its corresponding set of attributes:

\[
g_1 = \{\, (\mathit{EMP};\ \{\mathit{ENO}, \mathit{DNO}, \mathit{SAL}, \mathit{BON}\}),\ 
           (\mathit{DEP};\ \{\mathit{DNO}, \mathit{MNO}, \mathit{BUD}\}) \,\}
\]



We now define a database universe EXU over the database schema $g_1$. First,
the set-valued functions FE and FD introduce the (employee and department)
attributes and their corresponding value sets. Then the sets WE and WD
determine the set of allowed employee tables and department tables,
respectively. The function HE introduces the relation symbols (or table names)
EMP and DEP and associates them with their corresponding sets of allowed tables.
Finally, the DB universe EXU determines the set of allowed database states. For
reference purposes we numbered our key constraints and referential constraints;
for instance, the key constraint (KC1) expresses that employee numbers must
be uniquely identifying in the employee table, and the inclusion dependency
(RC2) expresses that each department manager must also be mentioned in the
employee table. Our notations are defined in the appendix.



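\[
\begin{array}{rll}
\mathit{FE} = & \{\ (\mathit{ENO}; \mathbb{N}), & \text{employee number} \\
              & \phantom{\{\ } (\mathit{DNO}; \mathbb{N}), & \text{department number} \\
              & \phantom{\{\ } (\mathit{SAL}; \mathbb{N}), & \text{salary} \\
              & \phantom{\{\ } (\mathit{BON}; \mathbb{N})\ \}; & \text{bonus} \\[4pt]
\mathit{FD} = & \{\ (\mathit{DNO}; \mathbb{N}), & \text{department number} \\
              & \phantom{\{\ } (\mathit{MNO}; \mathbb{N}), & \text{manager number} \\
              & \phantom{\{\ } (\mathit{BUD}; \mathbb{N})\ \}; & \text{budget} \\[4pt]
\mathit{WE} = & \{\, T \mid T \subseteq \Pi(\mathit{FE}) \text{ and} & \text{the set of allowed employee tables} \\
              & \phantom{\{\, } \{\mathit{ENO}\} \text{ is u.i. in } T \,\}; & \text{(KC1)} \\[4pt]
\mathit{WD} = & \{\, T \mid T \subseteq \Pi(\mathit{FD}) \text{ and} & \text{the set of allowed department tables} \\
              & \phantom{\{\, } \{\mathit{DNO}\} \text{ is u.i. in } T \,\}; & \text{(KC2)} \\[4pt]
\mathit{HE} = & \{\ (\mathit{EMP}; \mathit{WE}), & \text{employees} \\
              & \phantom{\{\ } (\mathit{DEP}; \mathit{WD})\ \}; & \text{departments} \\[4pt]
\mathit{EXU} = & \{\, v \mid v \in \Pi(\mathit{HE}) \text{ and} & \\
              & \phantom{\{\, } \{t(\mathit{DNO}) \mid t \in v(\mathit{EMP})\} \subseteq \{t(\mathit{DNO}) \mid t \in v(\mathit{DEP})\} \text{ and} & \text{(RC1)} \\
              & \phantom{\{\, } \{t(\mathit{MNO}) \mid t \in v(\mathit{DEP})\} \subseteq \{t(\mathit{ENO}) \mid t \in v(\mathit{EMP})\} \,\} & \text{(RC2)}
\end{array}
\]

We now illustrate the notion of a transaction by means of the DB universe EXU
and the tuple

\[
t_0 = \{\, (\mathit{DNO}; 3),\ (\mathit{MNO}; 7),\ (\mathit{BUD}; 1000000) \,\}
\]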

We might well be tempted, at first sight, to describe the addition of the
department tuple $t_0$ by means of the following function $f_1$, which assigns
to each DB state $v$ in EXU the new DB state with $t_0$ added to the DEP-table:

\[
f_1 = \lambda v \in \mathit{EXU} : \{\, (\mathit{EMP};\ v(\mathit{EMP})),\ 
      (\mathit{DEP};\ v(\mathit{DEP}) \cup \{t_0\}) \,\}
\]



Yet, the function $f_1$ turns out not to be a transaction on EXU! Indeed, $f_1$
is a function over EXU, but not into EXU: for some $v$ in EXU, $f_1(v)$ is not an
element of EXU. This holds in particular if $t_0 \notin v(\mathit{DEP})$ while
the department number 3 already occurs in the table $v(\mathit{DEP})$ or the
employee number 7 does not occur in the table $v(\mathit{EMP})$, due to the
requirements (KC2) and (RC2), respectively. Should we want to leave the state
unaltered in those special cases, then the description of such an insertion
attempt or request (see [2, Page 25]) could look like this:

\[
f_2 = \lambda v \in \mathit{EXU} :
\begin{cases}
f_1(v) & \text{if } f_1(v) \in \mathit{EXU} \\
v      & \text{otherwise}
\end{cases}
\]

It is clear from this description that the function $f_2$ is indeed a
transaction on EXU.

If, in addition, dynamic constraints have been established on EXU, then it
must also be the case that $(v; f_1(v))$, the transition from state $v$ to state
$f_1(v)$, is an admissible transition. Suppose for instance that new departments
can only be added in certain situations, e.g., if the number of employees is a
multiple of four or, shortly after, a multiple of four plus one. So, more
formally, if the number of employees is neither a multiple of four nor a
multiple of four plus one, then the projection of the new DEP-table
$v'(\mathit{DEP})$ on $\{\mathit{DNO}\}$ must be a subset of the projection of
the old DEP-table $v(\mathit{DEP})$ on $\{\mathit{DNO}\}$. Let us denote this
example of a dynamic constraint by (DC1). This dynamic constraint determines our
transition relation (see the appendix). So, we want to have the following
transition relation $R_0$ (telling us which direct state transitions are
allowed):

\[
\begin{array}{rll}
R_0 = & \{\, (v; v') \mid (v; v') \in \mathit{EXU} \times \mathit{EXU} \text{ and} & \\
      & \phantom{\{\, } \text{if } |v(\mathit{EMP})| \bmod 4 \notin \{0, 1\} & \\
      & \phantom{\{\, } \text{then } \pi_{\{\mathit{DNO}\}}(v'(\mathit{DEP})) \subseteq \pi_{\{\mathit{DNO}\}}(v(\mathit{DEP})) \,\} & \text{(DC1)}
\end{array}
\]

Note that this is an example of a conditional inclusion dependency “in time”.
Again, if we want to leave the state unaltered in the event of a non-allowed
transition, then the intended transaction is

\[
f_3 = \lambda v \in \mathit{EXU} :
\begin{cases}
f_1(v) & \text{if } (v; f_1(v)) \in R_0 \\
v      & \text{otherwise}
\end{cases}
\]



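We note that $(v; f_1(v)) \in f_1$, since $f_1$ is a function (and hence a set
of ordered pairs), and that $v \in \mathrm{dom}(f_1)$. So the function $f_3$ can
be rewritten in a closed formula as follows (by using the $\oplus$-operation as
defined in the appendix):

\begin{align*}
f_3 &= \{\, (v; f_1(v)) \mid v \in \mathit{EXU} \text{ and } (v; f_1(v)) \in R_0 \,\} \;\cup \\
    &\phantom{={}} \{\, (v; v) \mid v \in \mathit{EXU} \text{ and } (v; f_1(v)) \notin R_0 \,\} \\
    &= \mathrm{id}(\mathit{EXU}) \oplus (R_0 \cap f_1) \qquad \Box
\end{align*}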
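To make Example 1 concrete, the three functions can be tried out on small states. The Python sketch below is an illustration under assumptions, not the paper's formalism: a DB state is encoded as a pair of frozensets of plain tuples, the value sets of the attributes are not checked, and the names `in_exu`, `in_r0`, `T0` and the sample states are hypothetical.

```python
# A DB state is a pair (emp, dep) of frozensets of tuples:
#   emp tuples are (ENO, DNO, SAL, BON), dep tuples are (DNO, MNO, BUD).
T0 = (3, 7, 1_000_000)          # the department tuple t0


def in_exu(state):
    """Membership test for EXU: key constraints and referential constraints."""
    emp, dep = state
    enos = [e[0] for e in emp]
    dnos = [d[0] for d in dep]
    return (len(enos) == len(set(enos))              # (KC1)
            and len(dnos) == len(set(dnos))          # (KC2)
            and {e[1] for e in emp} <= set(dnos)     # (RC1)
            and {d[1] for d in dep} <= set(enos))    # (RC2)


def in_r0(old, new):
    """Membership test for R0: both states in EXU and (DC1) respected."""
    if not (in_exu(old) and in_exu(new)):
        return False
    if len(old[0]) % 4 in (0, 1):
        return True
    return {d[0] for d in new[1]} <= {d[0] for d in old[1]}


def f1(state):
    """Naive insertion of t0; the result may fall outside EXU."""
    emp, dep = state
    return (emp, dep | {T0})


def f2(state):
    """Insertion attempt that respects the static constraints."""
    new = f1(state)
    return new if in_exu(new) else state


def f3(state):
    """Insertion attempt that also respects the dynamic constraint (DC1)."""
    new = f1(state)
    return new if in_r0(state, new) else state


# Hypothetical sample states.
one_emp = (frozenset({(7, 1, 5000, 0)}), frozenset({(1, 7, 200_000)}))
two_emps = (frozenset({(7, 1, 5000, 0), (8, 1, 4000, 0)}),
            frozenset({(1, 7, 200_000)}))
no_emp7 = (frozenset({(8, 1, 4000, 0)}), frozenset({(1, 8, 200_000)}))
```

With these states, `f1(no_emp7)` leaves EXU because manager 7 is not an employee, so `f2` keeps the old state; `f2(two_emps)` adds the department, whereas `f3(two_emps)` does not, because two employees is neither a multiple of four nor a multiple of four plus one.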



In Example 1 we put forward that a transaction should also satisfy all dynamic
constraints and should therefore “fit” in the transition relation $R_0$. In that
case we speak of a transaction within $R_0$. In general we define:

Definition 2. If R is a relation, then:

\[
f \text{ is a transaction within } R \;\Leftrightarrow\; f \text{ is a function and } f \subseteq R.
\]



We can now summarize the results of Example 1 as follows:

— $f_1$ is not a transaction on EXU,

— $f_2$ is a transaction on EXU, yet not a transaction within $R_0$, and

— $f_3$ is a transaction within $R_0$.

When we compare Definition 1 with Definition 2 for the special case that
$R = U \times U$ (i.e., in case there are no dynamic constraints), we observe
that each transaction on $U$ is a transaction within $U \times U$ as well. We
note that the converse does not need to hold, since the domain of a transaction
within $U \times U$ is not necessarily equal to $U$ (or, in other words, since
the transaction is not necessarily defined in each state).

One popular (but implicit) way of specifying the set of “allowable” transitions
is by specifying the (currently) available transactions on a given database
universe $U$. However, we note that we can make a distinction between

1. the set of (user-defined) allowable transitions as specified by means of a
transition relation $R$ on the database universe $U$, i.e., $R \subseteq U \times U$, and

2. the set of (currently) available transitions as determined by a set $S$ of
(currently) available transactions on $U$.

We note that our distinction between allowable and available transitions
corresponds to the distinction between the behaviour layer and the action layer
in [14]. We also note that in practice the set $S$ of (currently) available
transactions might be subject to change independently from $R$. Of course each
available transition must also be allowed. On the other hand, whether an
allowable transition is actually available depends on the set of currently
available transactions! So, $\forall f \in S : f \subseteq R$, that is,
$\bigcup S \subseteq R$, but not necessarily $\bigcup S = R$.
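The difference between allowable and available transitions is easy to see on a toy universe. The following Python sketch is purely illustrative: the three-state universe, the relation `R` ("states never decrease") and the two transactions `stay` and `step` are hypothetical choices, with transactions encoded as dictionaries.

```python
def available_transitions(S):
    """Union of a collection S of transactions, each given as a dict state -> state."""
    return {(v, w) for f in S for v, w in f.items()}


U = {0, 1, 2}
# Hypothetical transition relation: a state may never decrease.
R = {(v, w) for v in U for w in U if w >= v}

stay = {v: v for v in U}          # the identity transaction
step = {0: 1, 1: 2, 2: 2}         # move one state up, saturating at 2
S = [stay, step]

avail = available_transitions(S)
missing = R - avail               # allowable, but currently not available
```

Every available transition is allowed (`avail` is a subset of `R`), but the allowable transition from 0 directly to 2 is not available with this particular set `S`.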



2.2 Adaptations 

The two adjustments made to the function $f_1$ in Example 1, finally resulting
in the transaction $f_3$ within the transition relation $R_0$, are of more
general interest. We therefore introduce the following concept (generalizing the
closed formula at the end of Example 1):

Definition 3. If $U$ is a set and $R \subseteq U \times U$ and $f$ is a function, then:

\[
\mathrm{Ad}(U, R, f) = \mathrm{id}(U) \oplus (R \cap f).
\]

We call $\mathrm{Ad}(U, R, f)$ the adaptation of $f$ (determined) by $U$ and
$R$. We speak of the adaptation of $f$ (determined) by $U$ if there are no
dynamic integrity constraints, in which case we are in fact dealing with the
transition relation $U \times U$, which is the most “liberal” transition
relation on $U$. We denote this adaptation (solely determined by the static
constraints) by $\mathrm{Adap}(U, f)$:

Definition 4. If $U$ is a set and $f$ is a function, then:

\[
\mathrm{Adap}(U, f) = \mathrm{Ad}(U, U \times U, f).
\]



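We give the following alternative descriptions of the two concepts just
introduced:

\[
\mathrm{Adap}(U, f) = \lambda v \in U :
\begin{cases}
f(v) & \text{if } v \in \mathrm{dom}(f) \text{ and } f(v) \in U \\
v    & \text{otherwise}
\end{cases}
\]

\[
\mathrm{Ad}(U, R, f) = \lambda v \in U :
\begin{cases}
f(v) & \text{if } v \in \mathrm{dom}(f) \text{ and } (v; f(v)) \in R \\
v    & \text{otherwise}
\end{cases}
\]

Returning to Example 1, we observe that
$f_2 = \mathrm{Adap}(\mathit{EXU}, f_1)$ and
$f_3 = \mathrm{Ad}(\mathit{EXU}, R_0, f_1)$ hold.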
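For finite sets, Definitions 3 and 4 can be executed literally once functions and relations are represented as sets of ordered pairs. The Python sketch below makes that assumption; the helper names `identity`, `override`, `ad` and `adap` are hypothetical, and `override` is read as function overriding (the second argument wins wherever it is defined), which is the reading suggested by the closed formula at the end of Example 1.

```python
def identity(U):
    """id(U) as a set of pairs."""
    return {(v, v) for v in U}


def override(g, h):
    """g overridden by h: h wins wherever it is defined, g is used elsewhere."""
    dom_h = {x for x, _ in h}
    return {(x, y) for x, y in g if x not in dom_h} | set(h)


def ad(U, R, f):
    """Ad(U, R, f): id(U) overridden by (R intersected with f)."""
    return override(identity(U), R & f)


def adap(U, f):
    """Adap(U, f) = Ad(U, U x U, f)."""
    return ad(U, {(x, y) for x in U for y in U}, f)


# Hypothetical toy data: f maps state 1 outside U, and R forbids most moves.
U = {0, 1, 2}
f = {(0, 1), (1, 5), (2, 0)}
R = {(0, 1), (2, 2)}

static_only = adap(U, f)     # state 1 is rolled back because f(1) = 5 is not in U
with_dynamic = ad(U, R, f)   # only the transition (0; 1) is admitted by R
```

Both results are total functions on `U`: every state for which the attempted transition is not admissible is simply mapped to itself, which is the rollback behaviour of the "otherwise" branches.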



In the “otherwise” cases, which means that the alteration attempt is cancelled,
we speak of a rollback of the alteration attempt. The following theorem shows
that the adaptations of functions indeed possess the nice properties we would
like them to have:

— the adaptation of a function is a transaction on the state space concerned
(a and b) and within the underlying transition relation extended with the
identity function on that state space (c);

— if the underlying transition relation $R$ is reflexive on the state space $U$
(as is usually the case), then the adaptation is a transaction within that
transition relation (d);

— if the function already was a transaction on the state space and within the
transition relation concerned, then the adaptation operation has no effect (e);

— (hence) the adaptation operation is “idempotent” for reflexive transition
relations, i.e. there is no use in applying the adaptation operation more than
once (f).

The proof of this theorem can be found in [7].

Theorem 1. If $U$ is a set and $R \subseteq U \times U$ and $f$ is a function, then:

(a) $\mathrm{Adap}(U, f)$ is a transaction on $U$;

(b) $\mathrm{Ad}(U, R, f)$ is a transaction on $U$;

(c) $\mathrm{Ad}(U, R, f)$ is a transaction within $R \cup \mathrm{id}(U)$;

(d) if $R$ is reflexive on $U$, then $\mathrm{Ad}(U, R, f)$ is a transaction within $R$;

(e) if $f$ is a transaction on $U$ and within $R$, then $\mathrm{Ad}(U, R, f) = f$;

(f) if $R$ is reflexive on $U$, then $\mathrm{Ad}(U, R, \mathrm{Ad}(U, R, f)) = \mathrm{Ad}(U, R, f)$.
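The theorem is proved in [7]; as a sanity check only, its statements can also be tested exhaustively on a very small state space. The Python sketch below is such a brute-force check and not a proof: it fixes the hypothetical universe `U = {0, 1}`, enumerates every relation `R` on it and every partial function from `U` into `{0, 1, 2}` (encoded as a dictionary), and uses the "if ... otherwise" description of the adaptation.

```python
from itertools import chain, combinations, product


def ad(U, R, f):
    """Ad(U, R, f) with f a dict (partial function) and R a set of pairs."""
    return {v: (f[v] if v in f and (v, f[v]) in R else v) for v in U}


def powerset(xs):
    """All subsets of xs, as tuples."""
    xs = list(xs)
    return chain.from_iterable(combinations(xs, n) for n in range(len(xs) + 1))


U = {0, 1}
ID = {(v, v) for v in U}
PAIRS = set(product(U, U))


def holds_everywhere():
    """Check statements (b) to (f) for every R and every small f; (a) is (b) with R = U x U."""
    for R in map(set, powerset(PAIRS)):
        # Every partial function from U into {0, 1, 2}; None marks "undefined".
        for images in product([None, 0, 1, 2], repeat=len(U)):
            f = {v: y for v, y in zip(sorted(U), images) if y is not None}
            g = ad(U, R, f)
            graph = set(g.items())
            if set(g) != U or not set(g.values()) <= U:          # (b)
                return False
            if not graph <= R | ID:                              # (c)
                return False
            if ID <= R and not graph <= R:                       # (d)
                return False
            on_U = set(f) == U and set(f.values()) <= U
            if on_U and set(f.items()) <= R and g != f:          # (e)
                return False
            if ID <= R and ad(U, R, g) != g:                     # (f)
                return False
    return True


all_ok = holds_everywhere()
```

The check covers 16 relations and 16 functions; a larger universe could be substituted for `U`, at the cost of a rapidly growing number of relations.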

We also introduced the adaptation concepts above because they model so well
what the architecture of interactions between a user and a database management
system could look like in general: for each “naive” function $f$, given by the
user as a maintenance attempt (hence, as a special sort of “application”), the
DBMS could de facto carry out the adaptation of $f$ determined by the DB
universe and the transition relation as known to the DBMS. Consequently, the
static and dynamic constraints that can be specified in the DBMS under
consideration need not be “interwoven” in all those separate applications, but
can be dealt with by the DBMS itself. As to the extent to which all sorts of
refined constraints can be given as input to existing DBMSs when specifying DB
universes and transition relations, we refer the reader to, e.g., “Focal Point
4” in [11] or, as the most up-to-date sources, to the reference manuals of the
DBMSs themselves. For systematic translations of formally specified database
universes, (static) integrity constraints, and queries into SQL2 we refer to
[7, Chapter 9].

We stress that the successive application of the adaptations of two functions
$f$ and $g$ need not at all give the same result as the application of the
adaptation of the composed function $g \circ f$ as one “atomic unit” (alias one
unit of work). In other words, it might be that the transaction
$\mathrm{Ad}(U, R, g) \circ \mathrm{Ad}(U, R, f)$ is a different function from
the transaction $\mathrm{Ad}(U, R, g \circ f)$! [7, Section 7.1] contains a
careful case analysis.
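A minimal counterexample shows why the two can differ. The Python sketch below is an illustration with hypothetical data, using only static constraints (that is, Adap): the allowed states are 0 and 2, the first step `f` passes through the forbidden state 1, and the second step `g` repairs it. Functions are encoded as dictionaries.

```python
def adap(U, f):
    """Adap(U, f) for f given as a dict (partial function)."""
    return {v: (f[v] if v in f and f[v] in U else v) for v in U}


def compose(g, f):
    """g after f: apply f first, then g (defined where both steps are defined)."""
    return {x: g[f[x]] for x in f if f[x] in g}


U = {0, 2}                 # the allowed states; state 1 is not allowed
f = {0: 1, 1: 1, 2: 2}     # first step, passes through the forbidden state 1
g = {0: 0, 1: 2, 2: 2}     # second step, repairs state 1

step_by_step = compose(adap(U, g), adap(U, f))   # each step adapted separately
as_one_unit = adap(U, compose(g, f))             # the composition adapted as a whole
```

Starting from state 0, the step-by-step variant rolls back the first step and stays in 0, whereas the atomic unit only looks at the final state and reaches 2. This mirrors the familiar difference between checking constraints after every statement and checking them at the end of a unit of work.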

In order to position the common sorts of transaction possibilities supported
by actual relational DBMSs and by the SQL2 standard (see [10,6]) within our
framework, we introduced some special classes of transactions in [7] that
together cover the largest part of the actual transaction possibilities in SQL.
In a sense, our transactions constitute a formal semantics for those SQL
statements.

3 Cascading Deletes

A well-known transaction phenomenon in databases is that of cascading deletes
(see for instance [5,10,13]). As an illustration, consider a database with data
on a company's clients and their orders. When we want to delete some clients, we
(implicitly) might want to delete their orders as well. And if there is a
separate table for the order lines, then we (implicitly) want to delete the
corresponding order lines too. So, a deletion of clients “triggers” deletions of
orders, which in turn trigger deletions of order lines. We call this chain of
triggering deletes cascading deletes. However, if such chains of deletes happen
to contain cycles, things can become quite complex. The reason for this
complexity is that the triggering of additional deletes can go on recursively,
as the next section will show.



3.1 An Example of Cascading Deletes 

The following example will show how the deletion of one tuple in one table can 
trigger the deletion of all tuples in almost all tables. The example will also be 
helpful in illustrating the ideas in the next section. 

Example 2. Suppose that we have a database universe with six table names, say 
A, B, C, D, E, and F, and with six “cascading” referential integrity constraints, 
say from C to A, from C to B, from D to C, from F to D, from E to D, and from 
B to E; see Figure 1. 




Fig. 1. A diagram with cascading referential integrity constraints 
In order to introduce our ideas presented in the following sections with a concrete 
example, we will consider an actual database state with its referencing tuples as 
depicted in Figure 2. For simplicity we will call the tuples A1, A2, B1, B2, 
and so on. 




Fig. 2. An actual database state with referencing tuples 



Below we will “calculate” the cumulative effect of deleting tuple 1 from the 
B-table; here “X → Y” stands for the fact that the deletion of the tuples X triggers 
the deletion of the tuples Y. 



B1         → C1, C2 → D1, D2, D3 → E1, E2, E3, F1-F6 →
B2, B3     → C3, C4 → D4, D5, D6 → E4, E5, E6, F7    →
B4, B5, B6 → C5
------------------------------------------------------- +
B1-B6        C1-C5    D1-D6        E1-E6, F1-F7



Fig. 3. The cumulative effect of the cascading delete of tuple B1 



Adding everything up, we see that within a few cycles all tuples of all tables 
but the A-table will be deleted! □ 

3.2 Informal Sketch of the Basic Ideas 

We now turn to the semantics of cascading deletes and want to deduce a general, 
declarative formal specification of the result of cascading deletes. We will start 
with an informal sketch of the basic ideas of our solution. The core of our solution 
is to consider the transitive closure of the “graph of referencing tuples” (such as 
depicted in Figure 2). In order to specify this graph, we “label” each tuple with 
the name of its table. Hence, the set of all labelled tuples of a database state v 
is 



$$\{(E; t) \mid E \in \mathrm{dom}(v) \text{ and } t \in v(E)\},$$

which can also be denoted as $\biguplus v$, the generalized disjoint union of v (see the 
appendix for the terminology and notations we used). The “referencing tuple 
graph” Rtg consisting of all reference pairs of labelled tuples can be described 
informally as 

$$\mathit{Rtg} = \{((E; t); (E'; t')) \mid \text{tuple } t \text{ in table } v(E) \text{ refers to tuple } t' \text{ in table } v(E')\}.$$

For the determination of the graph Rtg we have to consider only those referential 
integrity constraints that are declared as “cascading” by the database designer. 
We will work this out later by adding the set of those referential integrity 
constraints as a parameter. Let Delset denote the set of all labelled (!) tuples to be 
deleted initially. We note that Delset need not be restricted to only one table. 
The set of all labelled tuples to be deleted eventually will consist of 

— all elements of Delset and 

— all labelled tuples that directly or indirectly refer to an element in Delset: 

$$\mathit{Delset} \cup \{(E; t) \mid \exists y \in \mathit{Delset} : ((E; t); y) \in \mathrm{Tcl}(\mathit{Rtg})\},$$

where Tcl(Rtg) denotes the transitive closure of Rtg, i.e., the set of 
(begin point; end point)-pairs of all possible non-empty walks in the graphical 
representation of Rtg (see e.g. [18,7] for a formal definition of transitive 
closure). 

So, for each table name E, the following subset $\mathit{Nt}_E$ of $v(E)$ is the E-table in 
the intended new state: 

$$\mathit{Nt}_E = \{t \in v(E) \mid (E; t) \notin \mathit{Delset} \text{ and } \neg\exists y \in \mathit{Delset} : ((E; t); y) \in \mathrm{Tcl}(\mathit{Rtg})\}.$$

But also with cascading deletes, the intended new state might not be allowed, 
e.g., due to other static or dynamic constraints. In that case the database should 
remain completely unchanged. This calls again for our adaptation operation of 
Section 2. Thus, if f denotes the function assigning to each state v of the database 
universe U the intended new state just described, then $\mathrm{Ad}(U, R, f)$ would be the 
actual transaction on U (and within the given transition relation R). 
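The informal sketch above can be tried out on a small state. The following is a minimal sketch, not the authors' formalization: it computes the tuples to be deleted by walking the referencing tuple graph backwards from Delset, which yields the same set as consulting the full transitive closure. The five-table state and its references are hypothetical; they only mimic the cyclic shape of Example 2.

```python
def cascade_delete_set(rtg, delset):
    """Delset plus every labelled tuple that directly or indirectly
    refers to an element of Delset (reachability in the reversed graph)."""
    referrers = {}
    for x, y in rtg:  # (x; y) in Rtg means: x refers to y
        referrers.setdefault(y, set()).add(x)
    doomed = set(delset)
    frontier = list(delset)
    while frontier:
        y = frontier.pop()
        for x in referrers.get(y, ()):
            if x not in doomed:
                doomed.add(x)
                frontier.append(x)
    return doomed


def intended_new_state(state, rtg, delset):
    """For each table name E, keep the tuples t whose label (E; t) survives."""
    doomed = cascade_delete_set(rtg, delset)
    return {E: {t for t in table if (E, t) not in doomed}
            for E, table in state.items()}


# Hypothetical state; tuples are represented by their names only.
state = {"A": {"A1"}, "B": {"B1", "B2"}, "C": {"C1"}, "D": {"D1"}, "E": {"E1"}}
# Hypothetical cascading references: C -> A, C -> B, D -> C, E -> D, B -> E.
rtg = {
    (("C", "C1"), ("A", "A1")),
    (("C", "C1"), ("B", "B1")),
    (("D", "D1"), ("C", "C1")),
    (("E", "E1"), ("D", "D1")),
    (("B", "B2"), ("E", "E1")),
}

new_state = intended_new_state(state, rtg, {("B", "B1")})
print(new_state)
```

Deleting B1 empties every table except A, because the cycle B, C, D, E, B lets the deletion propagate back into the B-table.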

3.3 Formal Model of Cascading Deletes Results 

We are now ready for the formalization of these ideas. First we define the notions 
of a foreign key dependency and a reference graph. Informally, a foreign key 
dependency consists of two table names E and E' and an attribute transformation 
h such that the inclusion dependency “$E.\mathrm{dom}(h) < E'.\mathrm{rng}(h)$” holds on U 
and that rng(h) is a key of E' in U (see the appendix for the terminology and 
notations used): 
Definition 5. If U is a DB universe over a DB schema g, then: 

$(E; E'; h)$ is a foreign key dependency in U 
$\Leftrightarrow$ 
$E \in \mathrm{dom}(g)$ and $E' \in \mathrm{dom}(g)$ and 
h is a function and $\mathrm{rng}(h)$ is a key of E' in U and 
$\forall v \in U$ : h connects $v(E)$ with $v(E')$. 

If $(E; E'; h)$ is a foreign key dependency in U then dom(h) is called a foreign key 
in the literature. 

Definition 6. If U is a DB universe, then: 

G is a reference graph on U $\Leftrightarrow$ G is a set of foreign key dependencies in U. 

Figure 1 augmented with a labelling of each arrow with the proper underlying 
attribute transformation h would constitute a representation of a reference 
graph. We note that it is possible that E = E' (for instance, a reference to the 
manager of an employee) or that there are two different arrows from a given 
table symbol E to a given table symbol E' (for instance, references to the product 
as well as to the part in a bill of material). Note that in the latter case the two 
arrows must have different labels. In Definition 6 we deliberately talk about a 
reference graph on U and not necessarily about the reference graph of all foreign 
key dependencies in U. Our intention (from Definition 7 on) is to concentrate on 
the set of all foreign key dependencies that have to be cascading. For a reference 
graph G on a database universe U the following (auxiliary) function Gt(U, G) 
assigns to each database state v of U the graph of referencing tuples according 
to G: 

Definition 7. If U is a DB universe and G is a reference graph on U, then: 

$$\mathrm{Gt}(U, G) = \lambda v \in U : \{((E; t); (E'; t')) \mid (E; E'; h) \in G \text{ and } t \in v(E) \text{ and } t' \in v(E') \text{ and } t \upharpoonright \mathrm{dom}(h) = t' \circ h\}.$$

So, $\mathrm{Gt}(U, G)(v)$ denotes the graph of referencing tuples in DB state v (according 
to G). Hence, 

(a) $\mathrm{Gt}(U, G)(v)$ represents our former Rtg. 
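To illustrate Definition 7, the sketch below builds the graph of referencing tuples for one database state. It assumes tuples are modelled as frozen sets of (attribute, value) pairs and each attribute transformation h as a dictionary from referencing to referenced attributes. The CLIENT and ORDER tables, their attributes, and the helper names are hypothetical.

```python
def tup(mapping):
    """A tuple as a hashable function: a frozenset of (attribute, value) pairs."""
    return frozenset(mapping.items())


def referencing_tuple_graph(state, ref_graph):
    """All pairs ((E; t); (E'; t')) such that, for some (E; E'; h) in the
    reference graph, t restricted to dom(h) equals t' composed with h."""
    edges = set()
    for E, E2, h in ref_graph:
        for t in state[E]:
            t_map = dict(t)
            restricted = {a: t_map[a] for a in h}  # t restricted to dom(h)
            for t2 in state[E2]:
                t2_map = dict(t2)
                composed = {a: t2_map[h[a]] for a in h}  # t' o h
                if restricted == composed:
                    edges.add(((E, t), (E2, t2)))
    return edges


# Hypothetical state: two clients and two orders, both placed by client 1.
ann = tup({"no": 1, "name": "Ann"})
bob = tup({"no": 2, "name": "Bob"})
order10 = tup({"order_no": 10, "client_no": 1})
order11 = tup({"order_no": 11, "client_no": 1})
state = {"CLIENT": {ann, bob}, "ORDER": {order10, order11}}

# One foreign key dependency: ORDER.client_no refers to CLIENT.no.
ref_graph = [("ORDER", "CLIENT", {"client_no": "no"})]

edges = referencing_tuple_graph(state, ref_graph)
print(len(edges))  # 2
```

Both orders refer to the same client, so the graph has two edges ending in the labelled tuple for Ann and none ending in Bob.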

During the informal sketch in Section 3.2 we already noted that the originally 
intended deletes need not be restricted to just one table symbol. In our next 
definition we will represent the originally intended deletes by means of a function 
Q over U such that $Q(v)$ assigns to each relation symbol E the set of E-tuples 
to be deleted initially. Therefore, 

(b) $\biguplus Q(v)$ represents our former Delset and 

(c) $t \in Q(v)(E)$ is equivalent to $(E; t) \in \mathit{Delset}$. 

Casdelint(U, G, Q) will denote the function that assigns to each DB state v the 
intended new state. So, $\mathrm{Casdelint}(U, G, Q)(v)(E)$ has to represent our former 
subset $\mathit{Nt}_E$ of $v(E)$. Rewriting $\mathit{Nt}_E$ using (c), (b), and (a) above will lead to 
the following definition of Casdelint(U, G, Q). 
Definition 8. If U is a DB universe and G is a reference graph on U and Q is a 
function over U such that $\forall v \in U$ : $Q(v)$ is a set-valued function over Head(U), 
then: 

$$\mathrm{Casdelint}(U, G, Q) = \lambda v \in U : \lambda E \in \mathrm{dom}(v) : \{t \in v(E) - Q(v)(E) \mid \neg\exists y \in \biguplus Q(v) : ((E; t); y) \in \mathrm{Tcl}(\mathrm{Gt}(U, G)(v))\}.$$

We can prove that the function Casdelint(U, G, Q) preserves not only all attribute, 
tuple, and key constraints (like all deletes do), but also all foreign key 
dependencies that occur in G, by which we mean that if 
$v' = \mathrm{Casdelint}(U, G, Q)(v)$ with $v \in U$ and $(E; E'; h) \in G$, 
then $\pi_{\mathrm{dom}(h)}(v'(E)) \subseteq v'(E') \infty h$ and, of course, $\mathrm{rng}(h)$ is u.i. in $v'(E')$. 

Theorem 2. If U is a DB universe and G is a reference graph on U and Q is a 
function over U such that $\forall v \in U$ : $Q(v)$ is a set-valued function over Head(U), 
then: 

Casdelint(U, G, Q) preserves all attribute, tuple, and key constraints and 
all foreign key dependencies in G. 

Nevertheless, as we already noted above, the intended new state might not be 
allowed, e.g., due to other constraints. In that case the database should remain 
completely unchanged. This calls again for our adaptation operation of Section 2. 
Thus, with f = Casdelint(U, G, Q), the adaptation $\mathrm{Ad}(U, R, f)$ of f would be 
the actual transaction on U (and within a given transition relation R). We will 
denote this transaction by Casdel(U, R, G, Q): 

Definition 9. If U is a DB universe, $R \subseteq U \times U$, G is a reference graph on 
U, and Q is a function over U such that $\forall v \in U$ : $Q(v)$ is a set-valued function 
over Head(U), then: 

$$\mathrm{Casdel}(U, R, G, Q) = \mathrm{Ad}(U, R, \mathrm{Casdelint}(U, G, Q)).$$

This completes our formalization of the result of cascading deletes. We note that 
the “restricted” (or non-cascading) delete operation Del(U, R, E, q) in [7] is a 
special case of the delete operation of Definition 9 above, namely if there are no 
cascading foreign key dependencies and, moreover, Q only mentions E-tuples: 
Take $G = \emptyset$, $Q(v)(E) = q(v)$, and $Q(v)(E') = \emptyset$ for each $E' \neq E$. 

With Definition 9 we have a new standard form of adaptation which is much 
more “active” than the first one (in Definition 3). Nevertheless we could give a 
declarative specification of this complex transaction. This opens the way to give 
declarative semantics for more generally “active” databases as well. 

4 Conclusions and Future Work 

While specifications of queries usually are of a declarative nature, specifications 
of transactions mainly are of an operational and descriptive nature. Especially 
descriptions of complex transactions (such as cascading deletes) tend to be very 
operational. Declarative specifications of transactions usually suffer from the 
so-called frame problem or do not have a clear semantics. 

Often these descriptions turn out to be nondeterministic as well. This paper 
offers a general framework for declarative specifications of transactions, including 
complex ones. We also take the influence of static and dynamic constraints on 
the alleged transactions into account. Applications of our theory included in this 
paper are the declarative specification of cascading deletes and the distinction 
between allowable and available transitions. 

Our plans for further research in the area of declarative transaction specification 
include the incorporation of the declarative semantics of active databases, 
the design of trigger generators, and the treatment of databases as algebras 
(consisting of a state space, a transition relation, a collection of transactions, a 
collection of queries, etc.). 
Appendix: Basic No(ta)tions 

In this appendix we establish the basic notions and notations as we use them 
in this paper (see also [7]). We suppose that the reader is familiar with the 
notions of sets and functions (which are special sets of ordered pairs), and the 
fact that functions can be set-valued or even function-valued. Given a set A, 
we will use the notation “$\lambda x \in A \text{ and } C_x : U_x$” (where $U_x$ represents some 
expression in x and $C_x$ some condition for x) as an abbreviation for the function 
“$\{(x; U_x) \mid x \in A \text{ and } C_x\}$”. Since functions can be function-valued, λ’s can be 
nested. We denote the domain of a function f (i.e., the set of all actual first 
coordinates of f) by dom(f), the range of a function f (i.e., the set of all second 
coordinates of f) by rng(f), function composition by $g \circ f$ (g after f), functional 
overriding, where the function g “modifies” and/or “extends” the function f 
(see [17]), by $f \oplus g$, the identity function on a set A by id(A), and the set of all 
functions from a set A into a set B by $A \to B$. Thus: 

$$\begin{aligned}
\mathrm{dom}(f) &= \{x \mid (x; y) \in f\} \\
\mathrm{rng}(f) &= \{y \mid (x; y) \in f\} \\
g \circ f &= \lambda x \in \mathrm{dom}(f) \text{ and } f(x) \in \mathrm{dom}(g) : g(f(x)) \\
f \oplus g &= \{(x; y) \in f \mid x \notin \mathrm{dom}(g)\} \cup g \\
\mathrm{id}(A) &= \lambda x \in A : x \\
A \to B &= \{f \mid f \text{ is a function and } \mathrm{dom}(f) = A \text{ and } \mathrm{rng}(f) \subseteq B\}
\end{aligned}$$

Note that $g \circ f$ is a function over $\{x \in \mathrm{dom}(f) \mid f(x) \in \mathrm{dom}(g)\}$ and $f \oplus g$ is a 
function over $\mathrm{dom}(f) \cup \mathrm{dom}(g)$. 
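These notions map directly onto finite dictionaries. The sketch below assumes functions are represented as Python dicts; the helper names `compose`, `override`, and `identity` are chosen here for illustration and do not come from the paper.

```python
def compose(g, f):
    """g after f: defined on those x in dom(f) for which f(x) is in dom(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}


def override(f, g):
    """Functional overriding: g modifies and/or extends f."""
    return {**f, **g}


def identity(A):
    """The identity function on the set A."""
    return {x: x for x in A}


f = {1: "a", 2: "b", 3: "c"}
g = {"a": 10, "b": 20}

print(compose(g, f))                                # {1: 10, 2: 20}
print(override({1: "a", 2: "b"}, {2: "z", 3: "c"})) # {1: 'a', 2: 'z', 3: 'c'}
print(compose(f, identity({1, 2})))                 # {1: 'a', 2: 'b'}
```

Note that 3 drops out of the composition because f(3) = "c" is not in the domain of g, and that the overriding result is defined on the union of both domains.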
We consider a table over a set A as a set of functions over A, i.e., functions 
with domain A. If A is a set, then: 

T is a table over A $\Leftrightarrow$ T is a set and $\forall t \in T$ : t is a function over A. 

An element of a table T is called a tuple and an element of A is called an attribute 
of T. So, tuples are considered as functions assigning values to attributes. Since 
every table is a (special) set, every concept defined for sets also applies to tables. 
Thus, for example, the notions of union and intersection of two tables make 
sense. Similarly, since every tuple is a (special) function, every concept defined 
for functions also applies to tuples. For example, dom{t), the domain of a tuple 
t, is the set of attributes of t. 

By a database schema (or briefly DB schema) we mean a set-valued function, 
assigning to each table name its set of attributes. As a frame of reference we 
present Date’s well-known example in the database literature concerning suppliers, 
parts, and shipments (cf. [5]). The suppliers/parts/shipments-example has the 
following DB schema, which we will call g_0: 

g_0 = { (S;  {S#, SNAME, STATUS, CITY}),          suppliers 
        (P;  {P#, PNAME, COLOR, WEIGHT, CITY}),   parts 
        (SP; {S#, P#, QTY}) }                     shipments 

Here S# stands for supplier number, P# for part number, and QTY for quantity. 
Since every database schema is a function, every concept and notation defined for 
functions also applies to database schemas. For instance, we can speak about the 
domain of a database schema, which happens to be the set of table names (also 
called relation symbols). Note that in our example above, dom(g_0) = {S, P, SP} 
and, e.g., g_0(SP) = {S#, P#, QTY}. We define the concept of a database state 
(or briefly DB state) over g for any DB schema g: 

If g is a database schema, then: 

v is a DB state over g $\Leftrightarrow$ v is a function over dom(g) and 
$\forall E \in \mathrm{dom}(g)$ : $v(E)$ is a table over $g(E)$. 

Since every database state is a function, every concept defined for functions also 
applies to database states. The set of admissible states (to be determined by the 
organization in question) is some set of database states over g_0. We call such a 
set a database universe (or briefly DB universe) over g_0. In general we define: 
If g is a database schema, then: 

U is a DB universe over g $\Leftrightarrow$ U is a set of DB states over g. 

Example 1 in Section 2 contains the specification of a DB universe called EXU. 
Since every DB universe is a (special) set, each concept defined for sets applies 
to DB universes as well. If U is a DB universe over g, then we call g the DB 
schema of U, dom(g) the heading of U, also denoted by Head(U), an element E 
of dom(g) a table symbol (or “table name”, or “relation symbol”) of U, g(E) the 
heading of E in U, and an element of g(E) an attribute (or “attribute name”) 
of E in U. 

An element of the Cartesian product $U \times U$ is called a transition within 
U. Dynamic constraints can be captured by establishing the set of admissible 
(direct) transitions as a subset R of $U \times U$, having the intuitive meaning that 
$(v; v') \in R$ iff $(v; v')$ is an allowed consecutive state pair, i.e., iff the direct 
transition from state v to state v' is allowed. (See [8] for an in-depth treatment.) 
If U is a set, then: 

R is a transition relation on U $\Leftrightarrow$ $R \subseteq U \times U$. 

Usually, a transition relation R has to be reflexive on U, i.e., $\forall v \in U$: $(v; v) \in R$, 
in order to account for “dummy” transitions, e.g., when we try to delete some 
non-existing tuple. The restriction of a tuple t to an attribute set B is denoted 
by $t \upharpoonright B$, the projection of a table T on B is denoted by $\pi_B(T)$, and the renaming 
of a table T by an attribute transformation h, i.e., a function that assigns to 
each “new” attribute the “old” attribute it replaces, is denoted by $T \infty h$: 

$$\begin{aligned}
t \upharpoonright B &= \{(a; w) \mid (a; w) \in t \text{ and } a \in B\} \\
\pi_B(T) &= \{t \upharpoonright B \mid t \in T\} \\
T \infty h &= \{t \circ h \mid t \in T\}
\end{aligned}$$
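The three operators can be illustrated on a small table. The sketch below assumes tuples are represented as frozen sets of (attribute, value) pairs and attribute transformations as dictionaries from “new” to “old” attribute names. The sample supplier rows and the attribute name SUPPLIER_CITY are illustrative assumptions.

```python
def tup(mapping):
    """A tuple as a hashable function: a frozenset of (attribute, value) pairs."""
    return frozenset(mapping.items())


def restrict(t, B):
    """Restriction of tuple t to the attribute set B."""
    return frozenset((a, w) for (a, w) in t if a in B)


def project(T, B):
    """Projection of table T on B: the set of all restricted tuples."""
    return {restrict(t, B) for t in T}


def rename(T, h):
    """Renaming of table T by h: each tuple t becomes t composed with h."""
    renamed = set()
    for t in T:
        t_map = dict(t)
        renamed.add(frozenset((new, t_map[old])
                              for new, old in h.items() if old in t_map))
    return renamed


# Illustrative supplier rows.
S = {
    tup({"S#": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"}),
    tup({"S#": "S2", "SNAME": "Jones", "STATUS": 10, "CITY": "Paris"}),
    tup({"S#": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"}),
}

cities = project(S, {"CITY"})
print(len(cities))  # 2: duplicates collapse because a table is a set
supplier_cities = rename(S, {"SUPPLIER_CITY": "CITY"})
print(supplier_cities == {tup({"SUPPLIER_CITY": "London"}),
                          tup({"SUPPLIER_CITY": "Paris"})})  # True
```

Because t composed with h is only defined on dom(h), renaming by an h that mentions a single attribute also drops every other attribute, which is exactly what the connection requirement later relies on.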

We define the familiar notion of uniqueness on the “incidental” level of tables 
(where we will talk about uniquely identifying) as well as on the “structural” 
level of database universes (where we will talk about keys or superkeys). 

If A and B are sets and T is a table over A, then: 

B is uniquely identifying (or u.i.) in T 
$\Leftrightarrow$ $\forall t \in T : \forall t' \in T$ : if $t \upharpoonright B = t' \upharpoonright B$ then $t = t'$. 

If g is a DB schema, U is a DB universe over g, and $E \in \mathrm{dom}(g)$, then: 

B is a (super)key of E in U $\Leftrightarrow$ $\forall v \in U$ : B is u.i. in $v(E)$. 

The following notion constitutes a generalization of the notions of referential 
integrity and of inclusion dependency. Let h be a function which maps a set B of 
“referencing” attributes of a table T onto a set B' of corresponding “referenced” 
attributes of a table T' . Thus, the “attribute renaming function” or “attribute 
transformation” h indicates which attributes in B correspond to which attributes 
in B' . We say that h connects T with T' iff, informally speaking, all B-values in 
T also occur as B'-values in T' . Formally: 

If T is a table over A, T' is a table over A', and h is a function over A and 
$\mathrm{rng}(h) \subseteq A'$, then: 

h connects T with T' $\Leftrightarrow$ $\pi_{\mathrm{dom}(h)}(T) \subseteq T' \infty h$. 
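The two checks, “uniquely identifying” and “connects”, can be sketched for finite tables as follows. Tuples are again assumed to be frozen sets of (attribute, value) pairs, h is assumed to be a dictionary over the referencing attributes only, and the client and order tables are hypothetical.

```python
def tup(mapping):
    """A tuple as a hashable function: a frozenset of (attribute, value) pairs."""
    return frozenset(mapping.items())


def restrict(t, B):
    return frozenset((a, w) for (a, w) in t if a in B)


def uniquely_identifying(B, T):
    """B is u.i. in T: no two distinct tuples of T agree on all attributes in B."""
    seen = set()
    for t in T:
        key = restrict(t, B)
        if key in seen:
            return False
        seen.add(key)
    return True


def connects(h, T, T2):
    """h connects T with T2: the projection of T on dom(h) is contained
    in the renaming of T2 by h."""
    projected = {restrict(t, set(h)) for t in T}
    renamed = {frozenset((a, dict(t2)[h[a]]) for a in h) for t2 in T2}
    return projected <= renamed


clients = {tup({"no": 1, "name": "Ann"}), tup({"no": 2, "name": "Bob"})}
orders = {tup({"order_no": 10, "client_no": 1}),
          tup({"order_no": 11, "client_no": 2})}
h = {"client_no": "no"}

print(uniquely_identifying({"no"}, clients))  # True
print(connects(h, orders, clients))           # True
dangling = orders | {tup({"order_no": 12, "client_no": 9})}
print(connects(h, dangling, clients))         # False: client 9 does not exist
```

A foreign key dependency in the sense used in this paper combines both checks: the connection must hold in every state, and the referenced attributes must form a key.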

For the special case that h connects v(E) with v(E') for each DB state v of 
a DB universe U, we have a so-called inclusion dependency on U, sometimes 
written as “$E.\mathrm{dom}(h) < E'.\mathrm{rng}(h)$” (see [9]). In our more general definition, 
however, T and T' might also be subsets of v(E) and v(E'); another important 
special case is that T = v(E) and T' = v'(E), i.e. considering the same table 
symbol at two different “points in time”. Often, h = id(B) for some attribute 
set B. The connection requirement then reduces to: $\pi_B(T) \subseteq \pi_B(T')$. 

In our examples, we will use $\mathbb{N}$ to denote the set of all natural numbers 
(including 0). If S is a set of sets then $\bigcup S$ denotes the generalized union of S: 

$$\bigcup S = \{x \mid \exists A \in S : x \in A\}.$$

For each set-valued function F, $\prod(F)$ denotes the generalized product of F and 
$\biguplus F$ denotes the generalized disjoint union of F. Formally: 

$$\prod(F) = \{f \mid f \text{ is a function over } \mathrm{dom}(F) \text{ and } \forall x \in \mathrm{dom}(f) : f(x) \in F(x)\};$$
$$\biguplus F = \{(x; y) \mid x \in \mathrm{dom}(F) \text{ and } y \in F(x)\}.$$
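For finite set-valued functions, these three operators can be computed directly. The sketch below assumes a set-valued function is a dict whose values are sets; the function names are chosen for illustration.

```python
from itertools import product


def generalized_union(S):
    """All x that occur in at least one member of the collection S."""
    return set().union(*S)


def generalized_product(F):
    """All functions f over dom(F) with f(x) in F(x) for every x."""
    keys = list(F)
    return [dict(zip(keys, choice)) for choice in product(*(F[k] for k in keys))]


def generalized_disjoint_union(F):
    """All pairs (x; y) with x in dom(F) and y in F(x)."""
    return {(x, y) for x in F for y in F[x]}


F = {"A": {1, 2}, "B": {1}}

print(generalized_union(F.values()))                 # {1, 2}
print(sorted(generalized_disjoint_union(F)))         # [('A', 1), ('A', 2), ('B', 1)]
print(len(generalized_product(F)))                   # 2
```

The disjoint union keeps the value 1 twice, once labelled with A and once with B, which is the labelling of tuples by their table names used for cascading deletes.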
Abstract. The two-phase commit protocol is combined with the strict 
two-phase locking protocol as a means of ensuring atomicity and serializability 
of transactions. The implication of this combination on the length 
of time a transaction may hold locks on various data items might be 
severe. There are certain classes of applications where it is known that 
resources acquired within a transaction can be “released early”, rather than 
having to wait until the transaction terminates. Furthermore, there are 
applications involving heterogeneous competing business organizations, 
which do not allow their resources to be blocked; therefore, the preservation of 
local autonomy of individual systems is crucial. This paper describes an 
extension of the OMG’s Object Transaction Service, by adding the “open 
nested transaction model”, which greatly improves transaction parallelism 
by releasing the nested transaction locks at the nested transaction 
commit time. Open nested transactions relax the isolation property by 
allowing the effects of the committed nested transaction to be visible to 
concurrent transactions. We also describe how we benefit from this 
model using the proposed Asynchronous Nested Transaction model to 
overcome the limits of the current messaging products and standard 
specifications when they are confronted with the problem of guaranteeing 
the atomicity of distributed multi-tier transactional applications. 



1 Introduction 

The concept of a transaction has been developed to permit management of 
activities and resources in a reliable computing environment. Indeed, transactions 
are useful to guarantee consistency of applications even in case of failure and in 
the case of conflicting concurrent applications. The traditional flat transaction 
model, proposed by the OMG (Object Management Group) Object Transaction 
Service (OTS) [20], although suitable for applications using short transactions, 
may not provide enough flexibility and performance when used for more complex 
applications, such as CAD applications, connection establishment in 
telecommunication, or business travel applications that involve several servers on 
different sites and need access to many resources within a relatively long-lived 
transaction. Typically, the two-phase commit (2PC) protocol [7] is combined with 
the strict two-phase locking protocol [3] to guarantee atomicity and serializability 
of transactions. The implication of this combination on the length of time a 
transaction may hold locks on various data items might be severe. At each site, and for 
each transaction, locks must be held until either a commit or an abort message 
is received from the coordinator of the 2PC protocol. Since the 2PC protocol is 
a blocking protocol, the length of time these locks are held can be unbounded. 
There are certain classes of applications where it is known that resources 
acquired within a transaction can be “released early”, rather than having to wait 
until the transaction terminates. These applications share a common feature 
that application-level consistency is maintained, despite any non-ACID behavior 
they may exhibit. For some applications, failures do not result in 
application-level inconsistency, and no form of compensation is required. However, for other 
applications, some form of compensation may be required to restore the system 
to a consistent state from which it can then continue to operate. 

Moreover, the impact of indefinite blocking and long-duration delays is 
exacerbated in distributed systems where heterogeneous domains or database 
systems are integrated to enable the processing of multi-site or global transactions. 
The integrated systems may belong to distinct and possibly competing business 
organizations (e.g. competing computerized reservation agencies). Therefore, the 
preservation of local autonomy of individual systems is crucial. It is undesirable, 
for example, to use a protocol where a site belonging to a competing organization 
can harmfully block the local resources, a phenomenon that can occur under the 
2PC protocol. Although such applications can perhaps be implemented using 
traditional transaction systems, currently application programmers are required 
to build application-specific mechanisms to do this (such as mechanisms 
for saving application state, ad hoc locking mechanisms, mechanisms 
for compensating transactions, and so forth). CORBA functionality for 
supporting flexible ways of composing an application using transactions, with 
the support for enabling the application to possess some or all ACID properties, 
will greatly reduce the burden on application builders. 

In this paper we describe an extension to the OTS by adding the open nested 
transaction model, a generalization of Sagas [5], which introduce a two-level 
hierarchy between a saga and its child transactions, and of multilevel transactions 
[24], which impose a strict layered hierarchy of subtransactions capable of 
preserving serializability. 

The open nested transaction (ONT) model greatly improves transaction 
parallelism by releasing the nested transaction locks at the nested transaction 
commit time. That is, open nested transactions relax the isolation property by 
allowing the effects of the committed nested transaction to be visible to concurrent 
transactions, thus waiving the locks transfer rule of closed nested transactions 
(CNT). To design this model, all functionalities provided by the current OTS 
specification are fully used in the sense that the commitment protocol and the 
nested structuring exist. 

The remainder of this paper is organized as follows. Section 2 gives an 
overview of related work. Section 3 presents our transaction model built on the 
concept of open and closed nested transactions. Section 4 describes the 
experimentation of our transaction model in a conformant OTS implementation named 
MAAO OTS, developed by INRIA. The extensions we propose to the interfaces 
provided by OTS are based on the prototype named extended OTS (XTS), 
which has been prototyped in the scope of the ReTINA project [6]. Section 5 
describes how we can guarantee the atomicity of a distributed work unit by 
integrating the queueing transaction model with the open nested transaction model. 
Finally, Section 6 concludes the paper. 



2 Related Work 

Several enhancements to the traditional flat-transaction model have been 
proposed by relaxing the conventional ACID properties. In the literature, the proposed 
solutions are described as advanced transaction models [8,18]. Each of these 
models is well suited for a special class of applications, but none of them seems to 
be suitable and applicable for all kinds of complex and long-lived applications. 

By allowing nesting of transactions [17], OTS supports a finer control over 
recovery and concurrency. In particular, nested transactions (subtransactions) 
could be executed concurrently. The outermost transaction of such a hierarchy 
is typically referred to as the top-level transaction. Unlike top-level transactions, 
the commit of a subtransaction is provisional upon the commit/rollback of the 
enclosing transaction. Hence, the failure of a subtransaction does not necessarily 
lead to the failure of its enclosing transaction. Resource objects acquired 
within a subtransaction are inherited (retained) by parent transactions upon the 
commit of the subtransaction, and (assuming no failures) only released when 
the top-level transaction completes, i.e., they are retained for the duration of 
the top-level transaction. Thus, although subtransactions provide increased 
flexibility in the construction of transactional applications, they do so within the 
context of the ACID properties of transactions. We refer to this as the “closed 
nested transaction model”. 

Some database vendors provide various kinds of isolation levels (Read 
Uncommitted, Read Committed, Repeatable Read, Serializable). The concept of 
isolation level has been specified by the ANSI/SQL-92 [2] and re-used by the 
JDBC [11] and the Imprise Submission [10] in response to the OMG Persistence 
Service Request For Proposal. Lower isolation levels increase transaction 
concurrency at the risk of allowing transactions to observe an incorrect state of 
data. Although offering better performance at the cost of relaxed isolation, this 
flexibility does not guarantee atomicity within distributed transactions requiring 
a higher isolation level. 

Another approach to handling long-running activities is to have each step 
run as a transaction. Thus, an activity consists of multiple transactions. To deal 
with failures or exceptions across the steps of an activity, several models have 
been proposed [23,11]. These models support declarative specification of the 
control flow and an automatic compensation capability that offers some level of 
failure atomicity for the activity. These models are based on the conventional 
flat transaction model in which transactions are strictly sequential. 
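The idea of running each step as its own transaction and compensating on failure can be sketched as follows. This is a minimal illustration under stated assumptions, not any of the cited models: each step is a pair of an action and a compensation, steps run strictly sequentially, and when a step fails the compensations of the already committed steps run in reverse order. The travel-booking steps are hypothetical.

```python
def run_activity(steps):
    """Run (action, compensation) pairs in order, each action standing for
    its own transaction. On failure, compensate committed steps in reverse."""
    committed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for compensate in reversed(committed):
                compensate()
            return False
        committed.append(compensation)
    return True


log = []


def book_flight():
    log.append("book flight")


def cancel_flight():
    log.append("cancel flight")


def book_hotel():
    raise RuntimeError("no rooms available")


def cancel_hotel():
    log.append("cancel hotel")


ok = run_activity([(book_flight, cancel_flight), (book_hotel, cancel_hotel)])
print(ok)   # False
print(log)  # ['book flight', 'cancel flight']
```

The failed hotel step is not compensated, since it never committed; only the flight booking is undone, which is the limited form of failure atomicity such models offer.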
The concept of compensation has also been used in several transactional 
workflow systems, which can be used to provide scripting facilities for 
expressing the composition of an activity (a business process) out of other activities 
(which could be transactional), with specific compensation activities [18]. The 
workflow specification adopted by the OMG [19], based on the Workflow 
Management Coalition [25], specifies interaction between activities belonging to the 
same business process or interaction between business processes, and logging 
(auditLog) of each activity or process execution, which can usually be used for 
compensation, though the specification does not describe how the compensation 
is triggered. Transactional activities are in fact independent of flat transactions. 
Their corresponding transaction coordinators are independent of the 
coordinator of the enclosing process or activity; that is, the final outcome of an 
activity is known by querying its corresponding auditLog, and a transaction abort 
of an enclosing activity cannot be propagated to its sub-activities since they 
are executed in separate transactions. Hence, it seems left to the application 
to be aware of the transaction outcome (rollback outcome) and to decide whether 
to propagate the rollback to other activities and compensate those already 
committed [1]. 

Although the workflow management systems could be a more appropriate 
framework to address advanced applications [1] by a set of dependencies between 
activities, they lead to an overhead of messages in order to keep a workflow 
context across several participants. In turn this overhead generates a delay which 
can be detrimental for some applications looking for performance. 

Another possibility to deal with competing organizations and long-running 
transactions is the use of asynchronous communication. Asynchrony promotes 
time-independent processing [16], which in turn favors parallelism. One way to 
support asynchrony between applications is to use message queuing systems. 
The solutions adopted by the major message queuing product vendors rely on 
the use of the queuing transaction model (also called off-line transaction model 
[8]). In this model, the message producer sends its message in one transaction; 
the message queuing mechanism delivers the message in another transaction 
only if the first one commits. 
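The queuing transaction model described above can be sketched in a few lines. This is a simplified in-memory illustration, not the interface of any messaging product: the class and function names are hypothetical, and transactions are simulated by buffering messages until commit.

```python
class MessageQueue:
    """A queue that only ever holds messages from committed producers."""
    def __init__(self):
        self.messages = []


class ProducerTransaction:
    """Buffers sent messages; they reach the queue only on commit."""
    def __init__(self, queue):
        self.queue = queue
        self.pending = []

    def send(self, message):
        self.pending.append(message)

    def commit(self):
        self.queue.messages.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []


def deliver_all(queue, handler):
    """Deliver each message in its own simulated consumer transaction."""
    outcomes = []
    while queue.messages:
        message = queue.messages.pop(0)
        try:
            handler(message)
            outcomes.append((message, "committed"))
        except Exception:
            outcomes.append((message, "rolled back"))
    return outcomes


queue = MessageQueue()

aborted = ProducerTransaction(queue)
aborted.send("debit account")
aborted.rollback()            # never becomes visible to consumers

committed = ProducerTransaction(queue)
committed.send("reserve seat")
committed.commit()            # enqueued at commit time

handled = []
outcomes = deliver_all(queue, handled.append)
print(outcomes)  # [('reserve seat', 'committed')]
```

Because every delivery runs in a separate consumer transaction, each one has its own outcome, independent of the producer's transaction.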

IBM MQSeries [9] and BEA Tuxedo/Q [22] are product leaders in the field of 
messaging. More recently, OMG has adopted a new specification, called CORBA 
messaging [4]. This specification aims at an asynchronous operation invocation, 
using message passing. The OMG approach is to place the required changes at 
the Object Request Broker level, which implies a revision and changes to several 
parts of its architecture, mainly to the GIOP protocol. 

The current products as well as the ongoing CORBA messaging specification 
do not address the issue of the atomicity of the work expected by a client in its 
initial transaction involving several servers. Indeed, if a client invokes several 
different remote servers using a message queueing mechanism, each server will 
execute the message request in a separate transaction. Hence, each server 
transaction may have an independent outcome and the initial atomicity the client 
asked for may be lost. 
3 Nested Transaction Models 

Our transaction model is based on the concept of nested transactions as 
introduced by Moss [17,14,15]. In this model a transaction may contain any number 
of nested transactions which may recursively contain other nested transactions, 
giving rise to a tree of nested transactions or transaction family. In addition, we 
distinguish two nested models: closed nested and open nested transactions. 



Closed Nested Features 

Closed nested transactions exhibit at least two important advantages over flat 
transactions. First, they allow the potential internal consistent parallelism to be 
exploited. Second, they provide finer control over failures by limiting the effects 
of failures to a small part of the global transaction. These properties are achieved 
by allowing nested transactions within a given transaction to fail independently 
of their invoking transaction. 

Changes made by a closed nested transaction, when it is “committed”, re- 
main contingent upon commitment of all of its ancestors. But, since a committed 
closed nested transaction may be rolled back later, the nested transaction mo- 
del achieves serializability by requiring nested transactions to transfer all their 
acquired resources (e.g., locks) during their execution to their parent when they 
commit. 

The closed nested transaction commit is not a two-phase commit protocol
(2PC); only the relevant notifications (e.g., end of work related to the nested
transaction and synchronization) are sent to the participants to allow the nested
transaction’s parent to acquire their locks. We call this protocol the “finish
protocol”.
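The lock transfer performed by the finish protocol can be sketched as follows. This is a minimal in-memory illustration, not the MAAO OTS implementation; the class name `ClosedNestedTxn` and its methods are hypothetical names chosen for the example.

```python
class ClosedNestedTxn:
    """Toy transaction whose locks are inherited by the parent on finish."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.locks = set()      # resources currently locked by this transaction
        self.state = "ACTIVE"

    def acquire(self, resource):
        self.locks.add(resource)

    def finish(self):
        """Closed nested 'commit': hand every lock to the parent, no 2PC."""
        if self.parent is None:
            raise ValueError("finish applies to nested transactions only")
        self.parent.locks |= self.locks
        self.locks = set()
        self.state = "FINISHED"

    def rollback(self):
        """Fail independently of the parent: release only our own locks."""
        released, self.locks = self.locks, set()
        self.state = "ROLLBACKING"
        return released


top = ClosedNestedTxn("T1")
child = ClosedNestedTxn("T1.1", parent=top)
child.acquire("account_A")
child.finish()   # top now holds the lock on account_A
```

Because the parent keeps the inherited locks until it terminates itself, the effects of the finished child stay isolated from concurrent transactions.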



Open Nested Features 

The open nested transaction model is different from the closed nested transac- 
tion model in the following aspects: The open nested transaction model relaxes 
isolation by allowing the effects of a committed subtransaction to be visible to 
other concurrent transactions and thus avoiding the transfer of locks to the pa- 
rent as done in the closed nested model. On the other hand, a closed nested 
transaction extends the sphere of control of its parent so that its effects are still 
isolated against concurrent transactions. 

Since the locks acquired during the execution of an open nested transaction
are released at commit, the dependency induced by a parent transaction on its
open nested transaction cannot rely on traditional rollback as done for closed
nested transactions. When an ancestor rolls back, the undo of a committed open
nested transaction has to be performed by a compensating activity. We note that
not all transactions are compensable. Transactions involving real actions,
such as firing a missile or dispensing cash, may not be compensable. A formal
approach to recovery by compensating transactions can be found in [13].

3.1 Transaction Tree

Each transaction or nested transaction may involve one or several transaction
nodes. A transaction node can be either a root or a subordinate node. Moreover, a
transaction node, or simply a node, may also be a child or a parent node depending
on the parent transaction relationship, as illustrated in Fig. 1.

3.2 Transaction and Nested Transactions States 

At a given time, each node is characterized by a state as illustrated in Fig. 2, 
which describes significant transaction and subtransaction state transitions. The 
different paths correspond to the executions of different types of actions. 

When a transaction or a subtransaction is started, its node is created in the 
ACTIVE state. A subordinate node changes from the ACTIVE state into the 
PREPARED state once it has received the two-phase commit first request, or the 
prepare request. Then from the PREPARED state it moves either into COM- 
MITTING or ROLLBACKING state depending on the second two-phase commit 
request, which is either a commit request or a rollback request. 

A root node never enters the PREPARED state. Once it has received a
ready reply from all its subordinates, it decides to propagate commit to all
its subordinates and then moves into the COMMITTING state. When completely
committed, a top-level participant node moves from the COMMITTING state into
the INEXISTENT state. After the rollback, any node moves from the
ROLLBACKING state into the INEXISTENT state.

Fig. 1. Transaction Tree (legend: Superior/Subordinate Relationship;
Parent/Child Relationship; Transaction/Nested Transaction Node)

Fig. 2. Transaction and Subtransaction States

In the same way as for a top-level transaction, a root node of a nested
transaction may move from the ACTIVE state into the COMMITTING state. However,
if a compensating action has been specified for its nested transaction, this node
moves, when completed, from the COMMITTING state into the COMPENSATING
state in order to receive the outcome of its ancestors. That is, any rollback will
notify this node that the compensating action needs to be triggered.

When terminated with closed nested semantics, a node moves from the
ACTIVE state into the FINISHED state. It then becomes a subordinate of its
parent’s transaction: it either remains in the FINISHED state, if its parent
transaction is a nested transaction terminated with closed semantics, or it
moves into the PREPARED state, if its parent is the top-level transaction or a
nested transaction terminated with open nested semantics.
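The transitions described in this subsection can be collected into a small lookup table. The state names are those used in the text; the event names (`prepare`, `root_commit`, `finish`, and so on) are hypothetical labels introduced only for this sketch, and transitions that the text does not describe (for example, leaving the COMPENSATING state) are deliberately omitted.

```python
# (current state, event) -> next state, following the prose of Sect. 3.2.
TRANSITIONS = {
    ("ACTIVE", "prepare"): "PREPARED",              # subordinate, first 2PC request
    ("PREPARED", "commit"): "COMMITTING",           # second 2PC request: commit
    ("PREPARED", "rollback"): "ROLLBACKING",        # second 2PC request: rollback
    ("ACTIVE", "root_commit"): "COMMITTING",        # a root never enters PREPARED
    ("ACTIVE", "finish"): "FINISHED",               # closed nested termination
    ("FINISHED", "prepare"): "PREPARED",            # parent is top-level or open nested
    ("COMMITTING", "completed"): "INEXISTENT",
    ("COMMITTING", "await_ancestors"): "COMPENSATING",  # a compensator was specified
    ("ROLLBACKING", "completed"): "INEXISTENT",
}


def step(state, event):
    """Return the next state or raise ValueError for an undescribed transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in state {state}") from None


def run(events, state="ACTIVE"):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state
```

For instance, `run(["finish", "prepare", "commit", "completed"])` follows a closed nested node whose parent is the top-level transaction until it disappears.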

4 Experimenting with the ONT Model in an OMG-Conformant
Implementation

4.1 MAAO OTS Implementation of OTS

The “Moniteur d’Actions Atomiques Ouvertes” (MAAO) OTS developed by
INRIA [12] is an implementation of the Object Transaction Service (OTS) as
defined by the Object Management Group (OMG) [20].

MAAO OTS supports three distributed transaction models: the flat transaction
model, the optional nested model, which we refer to as the closed nested model,
and the open nested transaction model described in the previous section.



Fig. 3. MAAO OTS Architecture



Globally, MAAO OTS illustrated in Fig. 3 consists of two major parts: 

— The MAAO OTS Transaction Manager (TM) is a transaction manager which
provides the OTS interfaces and manages transactions.

— The MAAO OTS Library is a library linked with applications. It provides
the OTS Resource interface to enable applications to access legacy databases
and to enable hidden legacy databases to participate in the transactions
managed by the MAAO OTS TM.

MAAO OTS provides transaction management services and a transaction
propagation protocol through a set of well-defined interfaces. These interfaces,
defined in the OTS specification, are briefly described below:

— Current defines operations that allow a client of the Transaction Service
to explicitly manage the association between threads and transactions. The
Current interface also defines operations that simplify the use of the
Transaction Service for most applications. These operations can be used to begin
and commit/rollback transactions or nested transactions and to obtain
information about the current transaction.

— TransactionFactory is provided to allow the transaction originator to begin a
transaction. This interface defines two operations, create and recreate, which
create a new representation of a top-level transaction.

— Control allows a program to explicitly manage or propagate a transaction
context. An object supporting the Control interface is implicitly associated
with one specific transaction and provides two operations, get_terminator
and get_coordinator, which respectively return a Terminator object and a
Coordinator object.

— Terminator supports the commit and rollback operations used to
complete a transaction.

— Coordinator provides operations that are used by participants in a
transaction to query the transaction about its status and relationship with other
transactions, e.g., get_status, is_same_transaction, is_related_transaction, or
is_top_level_transaction. The register_resource operation allows a recoverable
object to register a resource as a participant in the transaction. If the
transaction is nested, the resource is implicitly registered with the top-level
transaction. When the top-level transaction commits, the Resource objects
participate in the two-phase commit protocol to commit or roll back the updates
performed as part of the transaction family. The register_subtran_aware
operation registers a resource with a subtransaction such that it will be notified
when the subtransaction commits or rolls back. The create_subtransaction
operation creates a child transaction of the current transaction and returns
a Control object for the new subtransaction. The Coordinator object is a key
component to extend a transaction and to control the transaction completion
between distributed participants.

— RecoveryCoordinator provides the replay_completion operation invoked by a
recoverable object to determine the state of the transaction after the
associated resource has been prepared. A reference to the RecoveryCoordinator
object is returned by the register_resource operation to recover from failure.

— Resource defines operations invoked by the OTS to participate in the
two-phase commit protocol. The operations supported by the Resource interface
are very similar to the ones defined by the X/Open XA interface, namely
prepare, rollback, commit, commit_one_phase, and forget.

— SubtransactionAwareResource is a specialization of the Resource interface
that supports subtransactions. This interface provides two operations,
commit_subtransaction and rollback_subtransaction, invoked by the OTS to
notify a Resource object about the completion of a subtransaction.

— TransactionalObject is a special interface to indicate the transaction quality
of service. By supporting this interface an object implicitly requires that the
transactional context associated with the current thread be propagated on
remote invocations.

The CORBA architecture provides access transparency and location
transparency. The interactions between the Transaction Service and application
objects are all performed through the ORB. This feature implies that the type of
transaction (local or distributed) is transparent to the Transaction Service. All
transactions are processed in the same way.

4.2 Interposition

The OTS specification defines a technique, named interposition, which allows
multiple Transaction Services to cooperate in order to support a global
transaction. When a transaction spreads to a new OTS domain, the transaction
context is exported and used by the importing Transaction Service
implementation to create a new instance of a Transaction Service object.

In the case of interposition, since the interposed Transaction Service regi- 
sters with the superior as a Resource object, the interposition is invisible to the 
superior Transaction Service. The superior Transaction Service propagates the 
transaction semantics to its subordinate without being aware of whether it is a 
real Resource or a subordinate Transaction Service. 



4.3 Open Nested Transactions Extension: The User’s View 

MAAO OTS extends the OTS by allowing any combination of open and closed
nested transactions. Open nested transactions behave like closed nested
transactions while they are active: the behavior of the two models differs only in
the commit semantics. MAAO OTS allows the programmer to defer the
choice of the nesting semantics until the commit decision. To differentiate
the open semantics from the closed semantics, MAAO OTS provides a new
operation, named definite_commit, which has been added to both the Current
and Terminator interfaces.

The definite_commit operation instructs the Transaction Service to definitely
commit a nested transaction, thereby implying the open semantics for that
nested transaction. In contrast to the commit operation, definite_commit forces a
definitive commit by triggering the two-phase commit protocol for the nested
transaction so that its effects are made permanent and visible to other
transactions. All locks acquired by the nested transaction and by its possible
closed nested descendant transactions are released.

The IDL definition part dealing with the open nested management is descri- 
bed below. 

    module CosTransactions {

        interface Compensator;   // forward declaration, defined below

        interface Terminator {
            void commit(in boolean report_heuristics)
                raises (HeuristicMixed, HeuristicHazard);
            void rollback();
            void definite_commit(in boolean report_heuristics,
                                 in Compensator c, in any data)
                raises (NoTransaction, HeuristicMixed, HeuristicHazard);
        };

        interface Current : CORBA::Current {
            void begin()
                raises (SubtransactionsUnavailable);
            void commit(in boolean report_heuristics)
                raises (NoTransaction, HeuristicMixed, HeuristicHazard);
            void definite_commit(in boolean report_heuristics,
                                 in Compensator c, in any data)
                raises (NoTransaction, HeuristicMixed, HeuristicHazard);
            void rollback();
        };

        interface Compensator {
            void compensate(in any data);
        };

    };

Compensation activities are typically specific to an application (and to
particular operations). If compensation activities are required, it may be more
efficient to exploit application-level semantics and allow the application
programmer to define the compensation behavior, rather than rely upon it being
system-driven. However, a compensation should be triggered automatically by the
Transaction Service. This means that the Transaction Service must know a
generic operation that it can invoke for this purpose.

A new interface, named Compensator, offers the compensate operation, which
is invoked by a Transaction Service to compensate a committed open nested
transaction. Since the behavior of the compensate operation is
application-dependent, its corresponding method is implemented by the application
itself.

In order to inform the Transaction Service of a nested transaction about
the existence of a Compensator object, the latter is passed as a parameter of the
definite_commit operation. If a nil Compensator object is passed in the
definite_commit operation, then the nested transaction is asked to commit with
open semantics, but no compensation is required if an ancestor transaction aborts.

The data parameter, wrapped in the CORBA::Any format, is
application-dependent. That is, an application can “cast” its own data to this
generic format. It is passed to the Transaction Service using the definite_commit
operation and is later used to invoke the Compensator object, which “recasts” it
to its original format. The Transaction Service itself is not aware of the real
nature of this data.

The data parameter can be nil meaning that the compensation action does 
not need any data. The Compensator object in the operation can be nil meaning 
that a compensating action is not needed if an ancestor aborts. 
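The interplay of definite_commit, the Compensator object, and the opaque data parameter can be sketched with a small in-memory simulation. This is only an illustration under stated assumptions: the `Current` class below is a toy stand-in rather than a CORBA binding, the report_heuristics argument is omitted, `RefundCompensator` and the `ledger` dictionary are hypothetical, and running compensators in reverse commit order is an assumption that the text does not state.

```python
class Current:
    """Toy per-thread transaction stack; each entry lists pending compensations."""

    def __init__(self):
        self.stack = []   # innermost transaction last

    def begin(self):
        self.stack.append([])   # list of (compensator, data) pairs

    def commit(self):
        pending = self.stack.pop()
        if self.stack:
            # Closed semantics: the parent inherits the pending compensations.
            self.stack[-1].extend(pending)
        # At top level nothing remains to compensate after commit.

    def definite_commit(self, compensator=None, data=None):
        if len(self.stack) < 2:
            raise RuntimeError("definite_commit requires a nested transaction")
        pending = self.stack.pop()
        if compensator is not None:   # a nil compensator means nothing to undo
            pending.append((compensator, data))
        self.stack[-1].extend(pending)

    def rollback(self):
        pending = self.stack.pop()
        # Assumption: most recently committed open nested work is undone first.
        for compensator, data in reversed(pending):
            compensator.compensate(data)


class RefundCompensator:
    """Application-defined compensator; only it understands the data."""

    def __init__(self, ledger):
        self.ledger = ledger

    def compensate(self, data):
        self.ledger[data["account"]] -= data["amount"]


ledger = {"A": 0}
current = Current()
current.begin()                    # top-level transaction 1
current.begin()                    # nested transaction 1.1
ledger["A"] += 50                  # effect of the nested transaction
current.definite_commit(RefundCompensator(ledger), {"account": "A", "amount": 50})
current.rollback()                 # transaction 1 rolls back, 1.1 is compensated
```

The simulated Transaction Service never inspects `data`; it only stores the pair and hands the data back to the compensator, mirroring the role of the Any parameter.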

The following example illustrates how a nested transaction can be terminated
with open semantics using the Current object:

    current->begin();      // begin a top-level transaction 1
    obj->op1();            // op1 is called within transaction 1
    current->begin();      // begin a nested transaction 1.1 whose
                           // parent is transaction 1
    obj->op2();            // op2 is called within the nested
                           // transaction 1.1
    current->definite_commit(cp, param_cp);  // commit transaction 1.1
    current->commit();     // commit the top-level transaction 1

4.4 Implementation Issues

When invoked to initiate a transaction with the create operation, the
TransactionFactory object creates a Transaction Service node (TS) dedicated to the
top-level transaction. This node offers the interfaces Control, Coordinator, and
Terminator to the programmer in order to manage the top-level transaction.
When a Transaction Service node is invoked through its Coordinator interface
to create a nested transaction, a new Transaction Service node (nesTS) is
created, which in turn provides the programmer with the same interfaces as its
parent to manage the created nested transaction. Fig. 4 illustrates the creation
of a Transaction Service node.




Fig. 4. Transaction Service Node Creation



During a (sub)transaction’s lifetime, Resource objects may be registered with the
Transaction Service node associated with this (sub)transaction. Fig. 5 describes
the transaction tree involving Transaction Service nodes and their associated
Resource objects. In the remainder, we will refer to the Transaction Service node
simply as the Coordinator. The transaction specification requires that certain
protocols are used to implement the atomicity property. These protocols affect
the implementation of recoverable servers (recoverable objects that register for
participation in the two-phase commit process) and of the coordinators that are
created by a transaction factory. Their responsibilities are to ensure the
execution of the two-phase commit protocol and to maintain state information in
stable storage, so that transactions can be completed in case of failures.

The first coordinator, referred to as the root coordinator and created for a
top-level transaction, is responsible for executing the two-phase commit protocol
for the top-level transaction. Any coordinator that is subsequently created for
an existing transaction becomes either a nested coordinator, if it is created as
the result of the create_subtransaction operation on the parent coordinator (as
described in Fig. 4), or an interposed or subordinate coordinator, if it is
created as the result of the recreate operation on a TransactionFactory object (2)
(as described in Fig. 6). By registering either a Resource object or a
SubtransactionAwareResource object (3), the interposed coordinator becomes a
transaction/subtransaction participant. In the interposed domain, the recoverable
server can create a nested transaction by invoking the create_subtransaction
operation on the interposed coordinator (4).

Fig. 5. Transaction Service and Resource Objects in a Transaction Tree




Fig. 6. Nested Transaction and Interposition (legend: Nes_coor(i): Nested
Coordinator/Root of nesting level i; INes_coor(i): Interposed Nested
Coordinator/Subordinate in nesting level i)



Top-Level Root Coordinator Role. As described in [20], the root coordinator
initiates the two-phase commit protocol when the client asks to commit the
transaction (with the commit operation invoked on either the Current interface
or the Terminator interface). The root coordinator issues the prepare request to
all registered resources.

Once at least one registered Resource object has replied VoteCommit and 
all others have replied VoteCommit or VoteReadOnly, a root coordinator may 
decide to commit the transaction by sending a commit request to each registered 
Resource object that responded VoteCommit. 

If any registered Resource object replies VoteRollback or cannot be reached,
the coordinator will decide to roll back and will inform the registered
resources that have already replied VoteCommit. Once a VoteRollback reply is
received, a coordinator need not send a prepare request to the remaining Resource
objects. Rollback is subsequently sent to the Resource objects that replied
VoteCommit.
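The voting rules of the root coordinator can be sketched as follows. This is an illustration only: the `Resource` class is a hypothetical in-memory participant with a fixed vote, `ConnectionError` stands in for an unreachable resource, and logging and heuristic outcomes are left out.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "VoteCommit"
    READ_ONLY = "VoteReadOnly"
    ROLLBACK = "VoteRollback"


class Resource:
    """Hypothetical participant that always answers prepare with the same vote."""

    def __init__(self, vote, reachable=True):
        self.vote = vote
        self.reachable = reachable
        self.outcome = None   # set once the second phase reaches this resource

    def prepare(self):
        if not self.reachable:
            raise ConnectionError("resource cannot be reached")
        return self.vote

    def commit(self):
        self.outcome = "committed"

    def rollback(self):
        self.outcome = "rolled back"


def complete(resources):
    """Run both phases and return the decision: 'commit' or 'rollback'."""
    voted_commit = []
    decision = "commit"
    for resource in resources:
        try:
            vote = resource.prepare()
        except ConnectionError:
            vote = Vote.ROLLBACK          # unreachable counts as a rollback vote
        if vote is Vote.ROLLBACK:
            decision = "rollback"
            break                         # remaining resources are not prepared
        if vote is Vote.COMMIT:
            voted_commit.append(resource)
        # VoteReadOnly resources take no part in the second phase.
    for resource in voted_commit:
        if decision == "commit":
            resource.commit()
        else:
            resource.rollback()
    return decision
```

Only the resources that replied VoteCommit receive the second-phase request, which matches the description above.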



Nested Coordinator Role. A nested transaction can be completed with: 

— Rollback: The application issues rollback on Current or Terminator, which
causes the rollback_subtransaction operation to be invoked on each
registered SubtransactionAwareResource object.

— Commit with closed semantics: The application issues a commit on Current or
Terminator. The nested coordinator must notify any registered
subtransaction-aware resources of the subtransaction’s commit using the
commit_subtransaction operation of the SubtransactionAwareResource interface.
When the subtransaction is committed and after all registered
subtransaction-aware resources have been notified about the commitment, the
subtransaction registers any resources registered using register_resource with
its parent coordinator, or it may register a subordinate coordinator to relay
any future requests to the resources.

— Commit with open semantics: The application issues definite_commit on
Current or Terminator. Acting as a root coordinator, the nested coordinator uses
the two-phase commit protocol to terminate the subtransaction by issuing
prepare and commit messages to all resources or subordinate coordinators
registered with it, as if it were a top-level transaction. If the commitment
protocol completes successfully and the Compensator object passed as parameter to
the definite_commit operation is not nil (meaning that a compensating
action is planned for this open nested transaction), the nested coordinator
becomes a participant of its parent transaction so that it can
receive a rollback coming from any ancestor and then invoke compensate to undo
the effects of its committed open nested transaction, as illustrated in Fig. 7.

Fig. 7. Open Nested Termination and Compensation Management

In the case of multi-level open nested transactions, the path between the
different nested coordinators should be preserved, since each of these
coordinators is responsible for saving the object reference of the Compensator
object that must be invoked if an ancestor rolls back. Note that the natural
behavior after the two-phase commit protocol completes is to delete first all
branches to subordinates, then all paths to reach them. In order to maintain a
branch to a coordinator that may have to compensate, a particular notification is
needed by any superior in a transaction tree. To this end, a new vote named
VoteReadyOpen has been added as a reply to the prepare request used in the
commitment of an open nested transaction. It indicates to the superior
coordinator that the branch from which this vote has been received needs to be
kept and must participate in the completion of the parent transaction. Fig. 8
shows how the nested coordinator of level n can be maintained level by level up
to the top-level transaction. Note that the vote VoteReadyOpen is only used
during the two-phase commitment of a nested transaction and never for the
top-level commitment. Indeed, it makes no sense to maintain a branch beyond the
commitment of the top-level transaction.




Fig. 8. Open Nested Termination - Compensator Awareness 



Interposed Coordinator Role. To keep track of any nested transaction
committed with open semantics in the case of interposition, the interposed
coordinator will use the vote VoteReadyOpen, as illustrated in Fig. 9, to indicate
to the superior domain that the path to the interposed coordinator needs to be
maintained up to the top level so that a compensation can be triggered if needed.
In fact, VoteReadyOpen can be used by any subordinate coordinator within the same
transaction level to indicate that a branch needs to be maintained, either
because this subordinate coordinator manages a Compensator object or because it
propagates this vote coming from its transaction sub-tree.
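The propagation rule for VoteReadyOpen can be sketched as a single function. The function name `prune_after_open_commit` and the plain-string votes are assumptions made for the example; only the rule itself (keep a branch that voted VoteReadyOpen, and vote VoteReadyOpen upward when holding a Compensator or a kept branch) comes from the text.

```python
def prune_after_open_commit(votes, has_compensator):
    """Decide which branches survive an open nested commit.

    votes maps a subordinate's name to the vote it returned on prepare.
    Returns (kept_branches, own_vote), where own_vote is what this
    coordinator reports to its own superior.
    """
    kept = sorted(name for name, vote in votes.items() if vote == "VoteReadyOpen")
    if has_compensator or kept:
        own_vote = "VoteReadyOpen"   # the branch through us must be maintained
    else:
        own_vote = "VoteCommit"      # nothing below us ever needs compensation
    return kept, own_vote
```

A coordinator without a Compensator of its own still answers VoteReadyOpen when one of its subordinates did, which is how the path is preserved level by level.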




Note: INes_coor(n-1) will participate in the completion of (n-2)

Fig. 9. Open Nested Termination and Interposition (panels: Transaction
Server 1, Transaction Server 2)



4.5 Open Nested Transaction and Recovery 

Failures and Logs. The Transaction Service provides atomic outcomes for
transactions in the presence of transaction failure or system/communication
failure. The technique for implementing transactions in the presence of failures
relies on the two-phase commit presumed abort protocol and on the use of logs.
The coordinator and the participants must log certain changes in their state so
that, if either of them fails and subsequently recovers, it can tell what state
it was in at the time of failure and take appropriate action. In particular, any
subordinate should log the prepare decision before acknowledging the prepare
request from its superior, and any coordinator must log the commit decision when
it has received all acknowledgements (VoteCommit) for its prepare requests.

The presumed abort assumption permits efficient implementations, since the
root coordinator does not need to log anything before the commit
decision and the participants (i.e., Resource objects and interposed coordinators)
do not need to log anything before they prepare. That is, any failure occurring
during the ACTIVE state will cause a rollback:

1. A transaction failure occurs when a transaction aborts. The recovery strategy
for an aborted transaction restores the previous values of all data
that the transaction wrote. Within this strategy, known as abort recovery,

— every active nested transaction is aborted and rolled back,

— every committed closed nested transaction is rolled back, and

— every committed open nested transaction is compensated by invoking
the compensate operation on the Compensator object.

2. Concerning the system or communication failures, two types of failure are 
relevant: a failure affecting the object itself due to the server crash (local 
failure) and a failure external to the object (external failure), such as failure 
of another object or failure in the communication with that object. 

An external failure is detected and reported to a calling object when the ORB
raises the standard exception COMM_FAILURE. The calling object cannot tell
whether the exception is due to a communication failure or to a crash of the
invoked object.



MAAO OTS Recovery Procedures. To ensure atomicity and durability in
the presence of failure, MAAO OTS provides additional protocols, based on those
described by the OTS specification, to ensure that transactions, once begun,
always complete. The approach is to continue the completion protocols at the
point where the failure occurred, which is referred to as recovery.

When restarted, each participant adopts a specific behavior according to its
position in the transaction tree.

— A Coordinator in the COMMITTING state has the responsibility for sending 
the commit decision to its registered subordinates. If any registered resources 
exist but cannot be reached, then the Coordinator must try again to send 
the commit decision. 

— A Resource object in the PREPARED state has the responsibility for
finding out its coordinator’s decision to either commit or roll back, by invoking
the replay_completion operation on the RecoveryCoordinator to get the
status of its associated transaction. If the superior coordinator exists but
cannot be reached (COMM_FAILURE exception returned), then the subordinate
must retry recovery later. If the superior coordinator no longer exists
(OBJECT_NOT_EXIST exception returned), then the outcome of the
transaction can be presumed to be rollback.

— A Coordinator in the COMPENSATING state is responsible for determining
the transaction outcome of its parent coordinator. If the parent coordinator
no longer exists, the outcome of the parent transaction can be presumed
to be rollback, and the compensation action of the committed open nested
transaction is triggered.
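The decision taken by a restarted Resource object in the PREPARED state can be sketched as follows. The exception classes are local stand-ins for the CORBA system exceptions COMM_FAILURE and OBJECT_NOT_EXIST, `replay_completion` is passed in as a plain callable, and the bounded retry loop with a `"retry_later"` result is an assumption of this sketch.

```python
class CommFailure(Exception):
    """Stand-in for COMM_FAILURE: the coordinator exists but is unreachable."""


class ObjectNotExist(Exception):
    """Stand-in for OBJECT_NOT_EXIST: the coordinator no longer exists."""


def recover_prepared(replay_completion, max_attempts=3):
    """Return the outcome a PREPARED participant should apply after restart."""
    for _ in range(max_attempts):
        try:
            return replay_completion()   # status reported by the coordinator
        except CommFailure:
            continue                     # unreachable: try again
        except ObjectNotExist:
            return "rollback"            # presumed abort
    return "retry_later"                 # still unreachable: recover later
```

A coordinator in the COMPENSATING state could apply the same rule to its parent, triggering the compensate operation when the result is a rollback.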

The compensation action may fail. If the OBJECT_NOT_EXIST exception is
returned when the compensate operation is invoked on the Compensator object,
which means that the object no longer exists, the standard HeuristicMixed
exception is returned, since one sub-tree has committed while another has rolled
back. If the COMM_FAILURE exception is returned, the compensate operation needs
to be re-invoked.

In order to avoid waiting for the completion of the compensation action,
the compensate operation could be invoked asynchronously via a
message queueing mechanism.

5 The Asynchronous Nested Transactions Model 

5.1 Discussion 

The current messaging products and specifications reach their limits when they
are confronted with the problem of guaranteeing the atomicity of distributed
multi-tier transactional applications. To illustrate this problem, let us consider
a simple credit/debit application. Assume that a client application would like
to credit one account and debit another at two different banks,
using the messaging paradigm. It is usually recommended to execute this work
in an atomic manner. However, using the current products or future CORBA
messaging products can lead to the following:

A client starts a transaction, makes two asynchronous calls, credit and debit,
then commits the transaction. The messaging system will start two different
transactions, one to debit one account and the other to credit another account.
If one of them definitely rolls back, the messaging system cannot undo the
committed transaction.

Because the client (or the Reply-Handler) is “rollback aware”, it will
receive a TRANSACTION_ROLLBACK message. The client (or Reply-Handler)
is then obliged to take the appropriate compensation action. But it is not always
able to do this, particularly if the topology of the multi-tier servers is complex.
Even if one of them can easily re-establish atomicity by executing the right
compensation action, it is not acceptable to burden them with such concerns.

The problem of atomicity in building multi-tier transactional applications
comes from the fact that when the client wants to send several requests
involving several remote servers in a single Unit of Work, the messaging systems
execute each request in a separate transaction. Further, the messaging systems
are not able to coordinate globally the outcome of these separate transactions.

The independent outcome property destroys the atomicity property. Thus, a 
high-level transaction control is needed that 

1. preserves the asynchronous communication style, 

2. respects the queueing transactional principles, and 

3. minimizes the client’s concerns about compensation actions.

To guarantee these requirements, we propose a new transaction model, called 
the Asynchronous Nested Transaction model. Before describing this model, we 
first define a set of transaction queueing principles based on characteristics of 
the existing products and the coming standards. 

The Transaction Queueing Principles. A queueing system (Fig. 10) is
interposed between clients and servers. Each server and each client owns an input
queue in the queueing system.

Fig. 10. Transaction Queueing Principles



— Within the scope of a first transaction T1, a client pushes a message
(request) to the server queue. The message becomes available if and only if T1
commits.

— In a second transaction T2, the server pulls a message from its queue, or
it waits for the queueing system to push the message to it.

In both cases, the server processes the message and enters a reply message
in the client input queue. When T2 commits, the message is removed from
the server’s queue.

— In a third transaction T3, the client takes the reply message on its own, or it
can delegate to a Reply-Handler, which picks up the reply message and processes
it. The Reply-Handler can be co-located with the client application or it
can be in a separate process.

If T2 rolls back, the message is scheduled for another attempt according to a
retry policy. It is possible that, once a retry limit is reached, the message is
moved to an error queue and deleted from the server’s pending messages. The
originating client or the Reply-Handler receives a TRANSACTION_ROLLBACK
message reporting the problem.

With regard to each transaction, the queueing system handles the messages
as if it were a database and coordinates the queue manager with the transaction
outcome (commit/rollback). Such a queueing system is called a recoverable
queueing system (RQS).
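The queueing principles above can be sketched with a small in-memory queue. This is a toy model, not a durable RQS: the class `RecoverableQueue`, its method names, and the `max_retries` policy are hypothetical, and persistence, the reply queue, and transaction T3 are left out.

```python
from collections import deque


class RecoverableQueue:
    """Toy server input queue coordinated with transaction outcomes."""

    def __init__(self, max_retries=2):
        self.ready = deque()       # (message, failed attempts) visible to the server
        self.errors = []           # messages that exhausted the retry policy
        self.max_retries = max_retries

    def enqueue_in(self, txn_body):
        """T1: run txn_body(put); staged messages become visible only on commit."""
        staged = []
        try:
            txn_body(staged.append)
        except Exception:
            return False           # T1 rolled back: nothing is enqueued
        self.ready.extend((message, 0) for message in staged)
        return True

    def process_one(self, handler):
        """T2: take one message and process it; on failure retry or park it."""
        if not self.ready:
            return None
        message, attempts = self.ready.popleft()
        try:
            reply = handler(message)
        except Exception:
            attempts += 1
            if attempts > self.max_retries:
                # Here the client would be sent TRANSACTION_ROLLBACK.
                self.errors.append(message)
            else:
                self.ready.append((message, attempts))
            return None
        return reply               # T2 committed: the message is gone for good
```

A message staged inside a failing T1 never reaches the server, and a message whose processing keeps rolling back ends up in the error queue instead of blocking the server.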

Asynchronous Nested Transaction Model. We define an asynchronous 
nested transaction (ANT) as follows: 

1. An ANT is a tree of transactions. The root transaction is the top-level
transaction; the other transactions are open nested transactions.

2. The top-level transaction contains:

— QT1, the request message producing subtransaction,
— QT2, the message processing subtransaction, and
— QT3, the reply message subtransaction.

3. The three subtransactions QT1, QT2, and QT3 obey the precedence rule:

QT1 > QT2 and QT2 > QT3.

4. Messages exchanged during the ANT’s active phase are held in a recoverable
queueing system.

5. Subtransaction completion uses a two-phase commit.

6. If the top-level transaction or an ancestor in the tree rolls back after a
subtransaction has committed, a compensating operation is assumed to undo,
if necessary, the effects of the committed subtransaction, since its locks
were released at completion time.

The ANT model relies on two main ideas: 

— It assumes the existence of a recoverable queueing system that provides
the asynchrony between two consecutive levels.

— The commit and rollback rules of the ANT model are the same as those of the
ONT model. This implies that the commit of an asynchronous nested
transaction is actually a two-phase commit. As in the ONT model, the rollback
of a committed subtransaction is performed by a compensation transaction,
which semantically reverses the effects of the whole ANT sub-tree.
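
The rollback rule can be sketched as a small tree structure. This is only an
illustration under simplifying assumptions: the class Subtransaction, its
state names, and the compensate callback are hypothetical, and neither the
two-phase commit nor the queueing system is modelled.

```python
class Subtransaction:
    """Node of a transaction tree with open nested commit semantics (illustrative)."""

    def __init__(self, name, parent=None, compensate=None):
        self.name = name
        self.state = "active"          # active | committed | aborted | compensated
        self.children = []
        self.compensate = compensate   # callable that semantically undoes the work
        if parent is not None:
            parent.children.append(self)

    def commit(self):
        # Open nesting: effects become visible now; locks are not passed upward.
        self.state = "committed"

    def rollback(self):
        # Handle descendants first, so that a sub-tree is undone bottom-up.
        for child in reversed(self.children):
            child.rollback()
        if self.state == "committed":
            if self.compensate is not None:
                self.compensate()
            self.state = "compensated"
        elif self.state == "active":
            self.state = "aborted"
```

Rolling back the root of such a tree aborts the subtransactions that are still
active and invokes the compensation of those that have already committed, in
reverse order of creation.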





Fig. 11. The Application View and The Tree Expansion and Control 
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Fig. 11 gives an engineering view of the ANT model implementation. Our first 
example shows the case where a client asynchronously sends a unique request to 
the server. 

The client begins by starting a top-level transaction in which it creates a
subtransaction QT1 in order to send a request to the server. As a result, a
coordinator object CoordTopL and a subcoordinator object SubCoQT1 are created
to control the top-level transaction and the subtransaction QT1. The
coordinator reference and the subcoordinator reference are propagated with
the request message. These references are stored in the RQS’s request queue
when QT1 commits. The RQS dequeues the coordinator object reference and asks
the coordinator to create a subtransaction QT2. The subtransaction QT2 is then
tied to the top-level transaction and managed by the subcoordinator object
SubCoQT2, which may be located either near the top-level coordinator CoordTopL
in the same client’s OTS service or in an interposed OTS domain.

After creating QT2, RQS dequeues the request and calls the server. The 
server first registers a resource object, then performs the requested work on 
behalf of QT2. Finally it returns the response and the control to the RQS. The 
RQS stores the response in a reply queue and commits. 

During the propagation and the execution of the request the client can take 
advantage of the asynchrony to perform other work. When it decides to retrieve 
the reply, it first starts a subtransaction QT3, pulls the RQS’s reply queue, and 
commits QT3. Depending on the reply content, the client may commit or abort 
the top-level transaction. 

The model can support a two-tier application in which a server in the first
tier calls another server (Server2 in Fig. 12). In this case Server1 in the
first tier behaves as a client. It starts the request message producing
subtransaction QT1^2 to send the request (Req2), performs any other work, and
later gets back the response in the reply message subtransaction QT3^2. QT1^2
and QT3^2 are enclosed by the subtransaction QT2^1, which is started by RQS1.
RQS2 in the second tier executes the message processing subtransaction QT2^2.

The examples given above can be generalized to cover the case where a client
wishes to send N requests to N different servers, which can in turn call
other servers. The top-level transaction may contain:

— one QT1 subtransaction producing N request messages,

— N message processing subtransactions denoted QT2_i^1, i ∈ [1, N] (which
should be read as QT2 number “i” at level “1”), and

— N reply message subtransactions denoted QT3_i^1, i ∈ [1, N].

Each QT2_i^1 can be structured as a top-level transaction and contains one
QT1^2 (at level 2) and several QT2^2 and/or QT3^2.



5.2 Structuring the Asynchronous Nested Transaction Model 

Application Level. Let us return to the credit/debit application to
illustrate the use of the Asynchronous Nested Transaction model. In this
example, we suppose two servers, bank1 and bank2, that respectively offer the
application objects O1 and O2 and the Compensator objects C1 and C2. The object
O1 possesses the credit1 operation and the object O2 possesses the debit2
operation. The Compensator objects C1 and C2 implement two compensation
operations: debit1 and credit2. A client starts a top-level transaction in
which it creates the subtransaction QT1. QT1 involves the credit1(100) and the
debit2(100) operations.

Fig. 12. Two-Tier Application Using the ANT Model



Transaction Tree Expansion and Control. During the active phase of QT1,
the coordinator reference is passed to the recoverable queueing system. When
QT1 commits, the recoverable queueing system starts two ANTs: QT2_1^1 and
QT2_2^1. QT2_1^1 involves the bank1 server and QT2_2^1 involves the bank2
server.

So that they become part of the client’s top-level transaction, the recoverable
queueing system uses the top-level coordinator reference to create QT2_1^1 and
QT2_2^1.

The recoverable queueing system commits QT2_1^1 and QT2_2^1 using the
operation definite-Commit with the Compensator objects C1 and C2 and the money
amount received from the client. Thus, the ONT-OTS-aware owns the
appropriate object references and data to compensate all effects of QT2_1^1
and QT2_2^1.

When the recoverable queueing system has to return the two reply messages
of the credit1 and debit2 operations, it creates the subtransactions QT3_1^1
and QT3_2^1 by using the top-level coordinator object. Thus QT3_1^1 and
QT3_2^1 become part of the top-level transaction.



Application Control and Concerns. Assume that an unrecoverable failure
occurs on bank1, leading to a rollback of the subtransaction QT2_1^1, while
QT2_2^1 commits normally. In the QT3_1^1 subtransaction, the Reply-Handler
receives the reply of QT2_1^1, which is TRANSACTION_ROLLBACK. In this
case, the Reply-Handler has to roll back the top-level transaction, and the
ONT-OTS-aware will then automatically compensate QT2_2^1 by calling the
operation compensate(100) on the Compensator object C2. The operation
compensate internally performs the operation credit2(100).
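
The credit/debit scenario can be traced with a short sketch. The classes
Account and Compensator and the function run_transfer are hypothetical
stand-ins for the application and Compensator objects; the queueing system,
the coordinators, and the reply subtransactions are deliberately left out, and
a boolean flag simulates the failure on bank1.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def credit(self, amount):
        self.balance += amount

    def debit(self, amount):
        self.balance -= amount


class Compensator:
    """Holds what is needed to semantically undo one committed subtransaction."""

    def __init__(self, undo_operation):
        self.undo_operation = undo_operation

    def compensate(self, amount):
        self.undo_operation(amount)


def run_transfer(account1, account2, amount, bank1_fails):
    committed = []                     # (compensator, amount) pairs
    # Subtransaction on bank2: debit2 commits and registers C2 (credit2).
    account2.debit(amount)
    committed.append((Compensator(account2.credit), amount))
    # Subtransaction on bank1: credit1 commits (registering C1) or rolls back.
    if not bank1_fails:
        account1.credit(amount)
        committed.append((Compensator(account1.debit), amount))
        return "committed"
    # The reply is TRANSACTION_ROLLBACK: roll back the top level and compensate.
    for compensator, value in reversed(committed):
        compensator.compensate(value)
    return "rolled back"
```

With bank1_fails set, the debit on the second account is undone by the
compensation, so both balances return to their initial values.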



The Asynchronous Nested Transaction Consistency. In order to prove
the consistency of the ANT model, we analyze the failure cases in
a simple topology containing a top-level transaction with QT1, QT2, and QT3
subtransactions. Five potential failure points are examined. They are denoted
A, B, C, D, and E in Fig. 13.



[Figure: failure points A to E along the top-level transaction, with the
server’s queue, the subtransactions, and the Reply-Handler queue]

Fig. 13. Failure Cases Analysis



— (A): If the top-level transaction rollback occurs during the active phase
of QT1, then QT1 is forced to abort. Thus, no request message is sent, and
the recoverable queueing system will not be able to create the QT2
subtransaction.

— (B): If the top-level transaction rollback takes place after the commit of
QT1 but before the start of QT2, then, according to the transaction queueing
principles, the message is not immediately removed from the server’s
queue.

However, when the recoverable queueing system tries to start QT2, the
coordinator object reference is invalid, since the top-level transaction has
rolled back. At that moment the recoverable queueing system discards the
request message from the server’s queue. This can be viewed as the
compensation of QT1, which is performed by the recoverable queueing system
itself.

— (C): If the rollback of the top-level transaction occurs during the active
phase of QT2, QT2 is forced to abort.

As in case (B), the request message is kept in the server’s queue, and the
server connector’s next attempt to create QT2 will be unsuccessful. In this
case, the recoverable queueing system discards the request message.
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— (D): If the rollback of the top-level transaction occurs after the commit of 
QT2 but before the starting of QT3, QT2 will be compensated by executing 
the compensate operation. 

Later, it will not be possible to start QT3 because the coordinator object 
reference is invalid. Consequently, the reply message is discarded. 

— (E): If the rollback occurs during the active phase of QT3, QT3 is forced
to abort, and the reply message is discarded as in case (D).

To summarize, when the top-level transaction rolls back: 

— all active subtransactions are forced to roll back, 

— all committed subtransactions are compensated, and 

— all upcoming subtransactions that try to join the rolled-back top-level
transaction are prevented from starting.

As a result, the atomic execution semantics wanted by the client are
achieved, and the requirements A1, A2, and A3 stated in Section 5.1
are fulfilled.
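
The three summary rules can be applied mechanically to the five failure
points. The table TIMELINE, the labels in RULES, and the function
outcome_on_rollback below are illustrative names, not part of the model. The
sketch treats QT1 in cases D and E like any other committed subtransaction,
which is one reading of the summary rather than a statement made in the case
analysis itself.

```python
# State of (QT1, QT2, QT3) at the moment the top-level rollback arrives.
TIMELINE = {
    "A": ("active", "pending", "pending"),
    "B": ("committed", "pending", "pending"),
    "C": ("committed", "active", "pending"),
    "D": ("committed", "committed", "pending"),
    "E": ("committed", "committed", "active"),
}

# The three rules stated in the summary of the case analysis.
RULES = {
    "active": "forced to roll back",
    "committed": "compensated",
    "pending": "prevented from starting",
}


def outcome_on_rollback(point):
    """Map each subtransaction to its fate for failure point A, B, C, D, or E."""
    states = TIMELINE[point]
    return {name: RULES[state] for name, state in zip(("QT1", "QT2", "QT3"), states)}
```

For example, outcome_on_rollback("C") reports QT1 as compensated (its request
is discarded by the queueing system), QT2 as forced to roll back, and QT3 as
prevented from starting.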

6 Conclusion 

In this paper we have described an extension to OTS that supports the open
nested transaction model. This model relaxes the isolation property by
releasing the locks acquired by a nested transaction at its completion, rather
than transferring them to the parent transaction as is done in the closed
nested model, which is already optional in OTS. This relaxation is suitable
for certain classes of applications in which it is known that resources
acquired within a transaction can be “released early” rather than held until
the transaction terminates. However, certain (typically application-specific)
compensation activities may be necessary to restore the system to a consistent
state when a parent transaction rolls back.

In terms of functionality and internal mechanisms, we have relied entirely on
those already provided by OTS. To terminate a nested transaction with open
semantics, we have used the two-phase commit protocol provided by OTS. To
structure a transaction into open nested transactions, we use the nested
transaction creation and deletion rules provided by OTS, thereby maintaining a
nested coordinator for each. The benefit is that a Compensator object is
created and its reference is maintained by the coordinators until the
top-level transaction commits or any ancestor (including the top-level
transaction) rolls back. Thus the OTS guarantees that the compensate
operation is invoked whenever it is specified and necessary.

Since open nested transactions have the same tree structure as closed nested
transactions, any ancestor rollback is propagated to its subtransactions. This
relieves the application from implementing the means of rollback signaling and
propagation that is needed when a long-lived activity is split into several
top-level transactions. At the present time, a new request for proposal
recently adopted by the Object Management Group [21] addresses the need for
application structuring mechanisms that use OTS transactions to increase
concurrency. A service involving the concept of compensation and using an
unchanged OTS could
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be proposed by a set of top-level transactions on top of OTS. However, an addi- 
tional specific context and protocol specific for this service is needed to maintain 
a path through top-level coordinators, in the case of interposition, while similar 
mechanisms are already provided by OTS, as described in this paper. 

The atomicity of a unit of work in a distributed environment remains crucial
for multi-tier applications and does not have to be abandoned even if the
communication paradigm is asynchronous. In this paper, we have proposed the
Asynchronous Nested Transaction model to overcome the limits of the current
messaging products and standard specifications when they are confronted with
the problem of guaranteeing the atomicity of distributed multi-tier
transactional applications. This model integrates an open nested transaction
model with transaction queueing principles.

We have also proved the feasibility of our model by building a framework that 
integrates our ONT-OTS-aware prototype MAAO-OTS [12] and our recoverable 
queueing system ATCS (Asynchronous Transactions Coupling System). 
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Abstract. Electronic Commerce over the Internet is one of the most 
rapidly growing areas in today’s business. However, considering the most
important phase of Electronic Commerce, the payment, it has to be no- 
ted that in most currently exploited approaches support for at least one 
of the participants is limited. From a general point of view, a couple 
of requirements for correct payment interactions exist, namely different 
levels of atomicity in the exchange of money and goods of a single cu- 
stomer with different merchants. Furthermore, as fraudulent behavior of 
participants in Electronic Commerce has to be considered, the ability 
to legally prove the processing of a payment transaction is required. In 
this paper, we identify the different requirements participants demand 
on Electronic Commerce payment from the point of view of execution 
guarantees and present how payment interactions can be implemented 
by transactional processes. Finally, we show how the maximum level of 
execution guarantees can be provided for payment processes in a natu- 
ral way by applying transactional process management to an Electronic 
Commerce Payment Coordinator. 



1 Introduction 

Along with the enormous proliferation of the Internet, Electronic Commerce is
continuously gaining importance. The spectrum of applications that are
subsumed under the term Electronic Commerce ranges from rather simple orders
placed by email to the purchase of shopping baskets consisting of several goods
originating from different merchants, with electronic cash tokens spent for
payment purposes.

Remarkably, Electronic Commerce is a very interdisciplinary research area.
As existing approaches are driven by different communities (e.g., cryptography
and networking), they are very heterogeneous in nature and each focuses
on different specific problems. From the point of view of the database
community, atomicity properties have been identified as one key requirement
for payment protocols in Electronic Commerce [14,15]. The more complex the
interactions with consumers and merchants become, the more dimensions of
atomicity have to be addressed. In the simplest case, only money has to be
transferred atomically from
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the consumer to the merchant. However, considering complex shopping baskets 
filled with (electronic) goods from several merchants, atomicity may also be 
required for the purchase of all these goods originating from different possibly 
independent and autonomous sources, along with the atomic exchange of money 
and all goods. 

Due to their distributed nature, protocols that have been suggested to
support payment atomicity in Electronic Commerce impose high requirements on
the participating instances (e.g., NetBill [4]). In these approaches, each
participant has to implement a given set of interfaces. Moreover, since the
application logic of these payment approaches is not centrally defined but
distributed over all participants, the participants have to meet demanding
prerequisites.

However, with a centralized payment coordinator, the complex interactions 
of the various participants can be embedded within a payment process, thus 
reducing the prerequisites for merchants and customers to participate in Elec- 
tronic Commerce. Transactional process management [12] can then be exploited 
in order to provide the necessary execution guarantees for Electronic Commerce 
payment processes in a natural way. 

This paper is structured as follows: In Section 2, we provide a general
framework for Electronic Commerce payment interactions. Based on this
framework, we analyze the different atomicity requirements for Electronic
Commerce payment (Section 3). Then, in Section 4, we briefly summarize
transactional process management and describe how these ideas can be exploited
to let Electronic Commerce payment processes benefit from the execution
guarantees provided by a payment process coordinator. Section 5 concludes the
paper.

2 Schema for Payment Protocols in Electronic Commerce 

The description for sales interactions in non-electronic markets [11] encompasses 
three phases: information, negotiation, and payment. During the information 
phase, a customer evaluates and compares the offers of several merchants. After 
selecting the best offer, she negotiates with the chosen merchant the conditions
for the deal (negotiation). If they reach an agreement, the last step (the
payment) involves the money transfer from customer to merchant and the service
(the merchant fulfills his contract).

Most electronic payment systems only focus on the money transfer of the 
last phase. Our view of an electronic payment scheme also considers the systems 
and protocols for accomplishing both the money transfer and the service. 

2.1 Participants 

An electronic payment scheme involves participants originating from two distinct 
worlds: on the Internet side there are the customer, the merchant, and the paym- 
ent server (also known as payment gateway) as a third entity which coordinates 
the former ones. The other side is represented by the financial world with its 
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proprietary network infrastructure and protocols. The participants are financial 
institutes and again the payment server, which has to consistently transform
the data flow on the Internet side into the corresponding “real world” money
flow. The participants are depicted in Figure 1.



2.2 Steps of an Electronic Commerce Transaction 

Prior to the payment transaction, the participants are involved in an
initialization phase, depicted in Figure 1 by dashed arrows. Both customer and
merchant have to establish accounts with the financial institutes “issuer”
and “acquirer”, respectively. The transformation of the electronic money into
real money is performed using these accounts. Also in this phase, the customer
receives from her bank a customer secret which enables her to perform
electronic payments. The customer secret is visible only to the customer
herself, to the issuing bank, and (possibly) to the payment server. The most
common form of the customer secret is the credit card number; in electronic
cash schemes (such as eCash™ [5]), the customer secret is an E-cash token.
Because account operations are far less frequent than payments, we can
consider them part of the initialization phase.







Fig. 1. Generic payment steps 
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Almost all the payment schemes contain the following steps, marked in Figure 
1 with numbers 1 to 5: 

— Negotiation (1): the customer selects the service or merchandise she
wants from the merchant and negotiates the price with the merchant.
The result of this step is the Order Information, a record of the
negotiation phase that includes the service (merchandise)
and price specification.

— Payment order (2): the customer sends Payment Information (PI) and Order 
Information (OIc) to the merchant. The OIc is the customer’s view of the 
agreement with the merchant. 

— Payment authorization (3): the merchant forwards PI, OIc, OIm, and
additional data to the payment server. OIm is the merchant’s view of the
agreement with the customer.

The payment server directly or indirectly verifies the validity of the payment
information and the consistency of the payment using OIc and OIm. It eventually
triggers the real world money transfer using its role on the non-Internet side.
At the end of the payment authorization, the merchant receives a confirmation
message C from the payment server (4).

— Purchase response (5): the merchant himself sends a confirmation to the
customer. In the case of electronic (non-tangible) goods, the purchase response
can be immediately followed by the merchandise or the service itself.

In most existing payment protocols, the payment server is invoked by the
merchant. This is not an intrinsic restriction, and communication between
customer and payment server is also possible.

2.3 Characteristics of Payment Protocols 

Several criteria serve to classify electronic payment schemes.
Depending on the moment at which real money is transformed into electronic
money, payment protocols can be split into pre-paid systems and
pay-by-instruction ones. Atomicity is another criterion, which will be
discussed in detail later. Some protocols introduce the notion of provability,
which is the ability of each party to prove their correct interactions.
Anonymity is especially addressed by cash-based systems. There are also
implementation issues like scalability, flexibility, efficiency, ease of use,
and off-line operation, which are important because of the large number of
persons expected to use these systems.

3 Atomicity in Electronic Commerce 

One key requirement in Electronic Commerce is to guarantee atomic interactions
between the various participants in Electronic Commerce payment. As Electronic
Commerce, and thus also payment, takes place in a highly distributed and
heterogeneous environment, various aspects of atomicity can be identified:
aside from money and goods atomicity [14,15], the atomic interaction of a
customer with multiple merchants is also needed. In what follows, we analyze
and classify these different atomicity requirements in detail.
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3.1 Money Atomicity 

The basic form of atomicity in Electronic Commerce is associated with the trans- 
fer of money from the customer to the merchant. This is denoted by the term 
money atomicity [14]. As no viable Electronic Commerce payment solution can 
exist without supporting this atomicity property, multiple solutions have been 
proposed or are already established [8,5]. However, the atomicity property is 
tightly coupled with the protocol architecture and design. 

3.2 Certified Atomic Delivery 

Aside from money, goods also have to be transferred. Therefore, a further
requirement is that the delivery takes place atomically. This can even be
strengthened in that both associated parties, customer and merchant, require
the necessary information in order to prove that the goods sent (or received,
resp.) are the ones both parties agreed to in the initial negotiation phase
(certified atomic delivery, encompassing the goods atomicity and the certified
delivery described in [14]). This strengthened requirement results from the
fact that, in contrast to traditional distributed database transactions where
only technical failures have to be addressed, fraudulent behavior of
participants also has to be coped with in Electronic Commerce.

Especially when dealing with goods that can be transferred electronically,
the combination of money atomicity and certified delivery is an important issue.
In [3], this is realized by a customized Two-Phase-Commit protocol (2PC) [6].
To support both dimensions of atomic interactions and to spare the payment
coordinator from dealing with the goods to be transferred, cryptographic
mechanisms are applied. Prior to the payment process, the merchant sends the
ordered goods to the customer in encrypted form. On successful termination of
the payment, the coordinator has to atomically both transfer the money from the
customer to the merchant and deliver the key needed to decrypt the previously
received goods to the customer.

3.3 Distributed Purchase Atomicity 

In many Electronic Commerce applications, the interaction of a customer is not
limited to a single merchant. Consider, for instance, a customer who wants to
purchase specialized software from a merchant. In order to run this software,
she also needs an operating system which is, however, only available from a
different merchant. As neither good is of value to the customer on its own,
she needs the guarantee that the purchase transaction with the two
different merchants is performed atomically, so that she gets either both
products or none. Distributed purchase atomicity addresses the encompassment of
interactions with different independent merchants into one single transaction.

This problem is in general reinforced by the fact that different heteroge- 
neous interfaces are involved and different communication protocols are suppor- 
ted by the participating merchants. To this end, in order to support distributed 
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purchase atomicity in very heterogeneous environments with applications using 
communication protocols for which there are no transactional variants (such 
as, for instance, HTTP), the Transaction Internet Protocol (TIP) [7] has been 
proposed. TIP is based on the Two-Phase-Commit protocol (2PC). The main 
idea of this protocol is to separate communication between transaction managers 
from the application communication protocol (two-pipe-model). While commu- 
nication at transaction manager level takes place by the TIP 2PC protocol, ar- 
bitrary protocols can independently be exploited at application communication 
level (such as, for instance, HTTP). 
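
The commit decision taken at the transaction manager level can be sketched as
a generic two-phase commit. This is not the TIP message format or state
machine: the class Participant, its can_commit flag, and the function
two_phase_commit are assumptions for illustration, and timeouts, logging, and
recovery are omitted.

```python
class Participant:
    """A transaction manager taking part in the commit protocol (illustrative)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1 vote: a participant that cannot commit aborts locally.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect all votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if the vote is unanimous, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

In the software and operating system example, a single negative vote from
either merchant leads to an abort at both, so the customer obtains both
products or none.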



3.4 Summary of Atomicity Requirements 

In Figure 2, the three dimensions of atomicity that can be identified in
Electronic Commerce applications are depicted. Most currently deployed payment
coordinators only support money atomicity, while some advanced systems also
address distributed purchase atomicity. However, to the best of our knowledge,
no existing system or protocol provides all three dimensions, although doing
so would offer the highest level of guarantees and is required by a set of
real-world applications.



Distributed Purchase Atomicity 




Fig. 2. Classification of Atomicity in Electronic Commerce 



This lack of support for full atomicity in Electronic Commerce payment is addres- 
sed in this paper where we apply transactional process management (Section 4) 
to realize an Electronic Commerce payment coordinator. 
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4 Transactional Process Management 

In this Section, we introduce the theory of transactional process management 
that provides a joint criterion for the correct execution of processes with res- 
pect to recovery (when failures of single processes have to be considered) and 
concurrency control (when multiple parallel processes access shared resources 
simultaneously) and we point out how this theory can be applied for payments 
in Electronic Commerce. 

4.1 Overview 

In conventional databases, concurrency control and recovery are well understood 
problems. Unfortunately, this is not the case when transactions are grouped 
into entities with higher level semantics, such as transactional processes. Since 
concurrent processes may access shared resources simultaneously, consistency 
has to be guaranteed for these executions. 

Transactional process management [12] has to enforce consistency for concur- 
rent executions and, at the same time, to cope with the added structure found 
in processes. In particular, and unlike in traditional transactions, processes in- 
troduce flow of control as one of the basic semantic elements. Thus, it has to 
be taken into consideration that processes already impose ordering constraints 
among their different operations and among their alternative executions. Si- 
milarly, processes integrate invocations to applications with different atomicity 
properties (e.g., activities may or may not be semantically compensatable) . 

The main components of transactional process management consist of a co- 
ordinator acting as top level scheduler and several transactional coordination 
agents [13] — one for each subsystem participating in transactional processes — 
acting as lower level schedulers. Processes encompass activities which are in- 
vocations in subsystems scheduled by the coordinator. Firstly, the execution 
guarantees to be provided by the coordinator include guaranteed termination, 
a more general notion of atomicity than the standard all or nothing semantics 
which is realized by partial compensation and alternative executions. Secondly, 
the correct parallelization of concurrent processes is required and thirdly, by ap- 
plying the ideas of the composite systems theory [1], a high degree of parallelism 
for concurrent processes is to be provided. 

The key aspects of transactional process management can briefly be summa- 
rized as follows: The coordinator acts as a kind of transaction scheduler that is 
more general than a traditional database scheduler in that it 

i.) knows about semantic commutativity of activities,

ii.) knows about properties of activities (compensatable, retriable, or pivot,
taken from the flex transaction model [9,16]), and

iii.) knows about alternative execution paths in case of failures.

Based on this information, the coordinator ensures global correctness but only 
under the assumption that the activities within the processes to be scheduled 
themselves provide transactional functionality (such as atomicity, compensata- 
bility, order-preservation, etc.). 
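
How activity properties and alternative execution paths combine into
guaranteed termination can be sketched as follows. This is a strongly
simplified illustration, not the scheduler of [12]: the dictionary layout of
an activity and the functions run_path and run_process are hypothetical,
concurrency is ignored, and each path is assumed to be well formed, so that a
pivot is followed only by retriable activities.

```python
def run_path(path, log):
    """Execute one execution path; return True if it runs to completion.

    Each activity is a dict with the keys 'name', 'kind' ('compensatable',
    'pivot', or 'retriable'), 'run' (returns True on success) and, for
    compensatable activities, 'undo'.
    """
    done = []
    for activity in path:
        ok = activity["run"]()
        while not ok and activity["kind"] == "retriable":
            ok = activity["run"]()       # retriable: assumed to succeed eventually
        if not ok:
            # Backward recovery: compensate completed compensatable activities.
            for finished in reversed(done):
                if finished["kind"] == "compensatable":
                    finished["undo"]()
                    log.append("undo " + finished["name"])
            return False
        done.append(activity)
        log.append(activity["name"])
    return True


def run_process(paths, log):
    """Try the alternative execution paths in order until one terminates."""
    return any(run_path(path, log) for path in paths)
```

A process whose first path fails at its pivot is thus partially compensated
and continued on the next alternative, rather than being undone as a whole.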
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4.2 Application of Transactional Process Management in Electronic 
Commerce 

According to [10], trade interactions between customers and merchants can be 
classified in three phases: pre-sales, sales, and post-sales. While the sales phase 
has a well-defined structure (especially the payment processing, see Section 2), 
this is in general not the case for the pre-sales and the post-sales phase. 

Due to this well-defined structure, processes are a highly appropriate means 
to implement the interactions that have to be performed for payment purposes. 
Furthermore, the atomicity requirements for payments in Electronic Commerce 
can be realized in an elegant way by applying the ideas of transactional process 
management in an Electronic Commerce Payment Coordinator. 

Based on the NetBill protocol, which guarantees both money atomicity and
certified atomic delivery, payment processes can be enhanced to additionally
provide distributed payment atomicity. To this end, and in contrast to the
currently applied payment schemes, the payment has to be initiated by the
customer. She has to invoke a payment process at the Payment Coordinator by
specifying the payment information PI and all n bilaterally agreed Order
Information items (and thus also all different merchants) that have to be
encompassed within one single payment transaction. Therefore, a tuple
(OI_C, M)_j with Order Information OI_C and Merchant Identifier M for each
product j with 1 ≤ j ≤ n has to be sent to the Payment Coordinator. Within the
payment process invoked, the necessary steps are taken to guarantee all three
dimensions of atomicity. The Payment Coordinator first contacts all merchants
involved and collects each merchant's view of the Order Information, (OI_M)_j.
Then, in order to determine the success of the payment transaction, (OI_M)_j
and (OI_C)_j are compared for each product j. In the case of success, the
Payment Coordinator collects all keys from all merchants participating in the
transaction, checks the validity and the value of the E-cash token received,
and atomically delivers all keys to the customer while at the same time
initiating the money transfer to the merchants and sending a confirmation C_j
to all merchants. If (OI_M)_j and (OI_C)_j do not match for some j, some keys
are not available, or the E-cash token is not correct, the Payment Coordinator
aborts the payment transaction and no exchange takes place.
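
The coordinator logic described above can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions, not the
implementation of the Payment Coordinator: the names OrderItem, Merchant and
run_payment are hypothetical, each product is assumed to come from a distinct
merchant, the E-cash check is reduced to comparing the token value with the
total price, and everything runs in one process, so crashes and lost messages
(which a persistent process log would have to cover) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderItem:
    merchant_id: str   # Merchant Identifier M
    customer_oi: str   # customer's view of the Order Information, OI_C


class Merchant:
    """Hypothetical stand-in for a merchant endpoint."""

    def __init__(self, merchant_id, order_info, key):
        self.merchant_id = merchant_id
        self._order_info = order_info  # merchant's view, OI_M
        self._key = key                # decryption key, None if unavailable
        self.confirmed = False

    def order_information(self):
        return self._order_info

    def release_key(self):
        return self._key

    def confirm(self):
        self.confirmed = True


def run_payment(items, merchants, token_value, total_price):
    """Return all keys on success, or None if the payment is aborted.

    Every check is made before any confirmation is sent, so an abort
    leaves all merchants unconfirmed and hands no key to the customer.
    """
    keys = []
    for item in items:
        merchant = merchants[item.merchant_id]
        # Compare OI_M with OI_C for this product.
        if merchant.order_information() != item.customer_oi:
            return None
        key = merchant.release_key()
        if key is None:
            return None
        keys.append(key)
    # Validity and value of the E-cash token (reduced to a value check).
    if token_value != total_price:
        return None
    # Commit point: confirm to every merchant, then deliver all keys.
    for item in items:
        merchants[item.merchant_id].confirm()
    return keys
```

Because all checks precede the commit point, a mismatch, a missing key or a
bad token leaves every merchant unconfirmed and the customer without keys.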

Electronic Commerce payment can benefit in several ways from a Payment
Coordinator based on transactional process management ideas. Firstly, as
application logic is centrally defined, the prerequisites for the participants
of Electronic Commerce trade (customers, merchants, banks) are minimized.
Secondly, with the inherent structure of payment processes invoked by a
customer, it is possible to provide all dimensions of atomicity identified as
requirements in Electronic Commerce. Thirdly, this is enhanced by additional
properties such as the possibility to legally prove correct execution of a
payment process by persistently logging process execution. This process log is
part of transactional process management and is thus provided by the Payment
Coordinator in an elegant and straightforward way. Finally, because payment
processes are executed by a Payment Coordinator, monitoring the state of a
payment interaction is easier than with the distribution found in current
payment protocols.
5 Conclusion

This paper provides a detailed analysis of the requirements that participants
in Electronic Commerce payment impose with respect to atomicity. Different
levels of atomicity can be identified which, however, are not simultaneously
provided by existing approaches. Using the notion of processes, it has been
shown that all payment interactions can be embedded into a single payment
process in which all possible levels of execution guarantees can be provided
while at the same time the prerequisites of the participants are reduced.
Finally, by applying the ideas of transactional process management, it has been
shown how a Payment Coordinator supporting atomic and provable payment
processes can be developed.

This process-based Payment Coordinator is currently being implemented within
the Wise system [2]. Based on this implementation, we will in our future work
extend the analysis of payment processes to further properties (such as
anonymity, scalability, or flexibility). Our goal is to decouple these
properties, to identify the building blocks needed to realize them, and to
flexibly generate payment processes with user-defined properties by plugging
together the building blocks needed.
private customer base has been created for electronic commerce. E-commerce is
rapidly expanding in the USA, and Europe and Japan are following the trend.

So far, the development of E-commerce has happened in a rather unregulated
way, especially in the USA, where the philosophy has been to let the market
find the best practices. In Europe, the European Commission has been developing
a regulatory basis in the form of directives. Unfortunately, many of the
directives regulating E-commerce are still at the proposal stage.

A difference between the USA and Europe is that in Europe more emphasis is put
on consumer protection, which is believed to be achieved by writing the
necessary requirements into the directives and later into national legislation
[30]. The regulatory tool set in the EU is thus a set of directives in which
the applicable principles are laid down. They should then be incorporated into
the national legislation of the member countries. Japan seems to be behind the
USA and the EU in this development, but its initial ideas concerning regulation
seem to be close to the European ones [28].

Unfortunately, there are currently over twenty directives or directive
proposals that regulate or will regulate different aspects of electronic
commerce (see [4,6,5,18,16,17,15,10,11,9,29,13,7,8,14]).

Until the issues have been settled, this makes the environment rather
difficult for the players in the field. From a technical point of view, there
is the risk that merchants in different countries set up E-commerce servers
that impose different protocols between the customer and the server. This makes
automation of the services cumbersome at the customer end, and the differences
might deter customers from E-commerce.

The European Commission has now decided to restructure the regulatory
framework. After that, it should consist of only about five to six directives
that cover usage and licensing of telecommunication networks,
broadcast-oriented terrestrial and satellite TV networks, and computer
networks, especially IP networks [24]. So far, the directive proposal [29] is
rather central from our point of view, because it regulates to some extent the
structure of the business processes at the merchant and customer. Third parties
are not really addressed (an intermediary is defined, but the term concerns
network operators and similar instances providing the infrastructure). Another
important one is [15], which lays down the principles for electronic
signatures.

The legislators in the EU consider Information Society Services (ISS) a major
innovation. These are services provided at a distance through a communication
network using some form of electromagnetic carrier. The service is always
requested by a customer, i.e. the interaction is not started by the merchant
([29], Art. 2). Contracts can be made electronically, except contracts that
require a notary or registration by a public authority, or that concern family
law or the law of succession ([29], Art. 9).

The above directive proposal ([29], Art. 5) states that the merchant must give
the information necessary to be identified. The information also includes the
physical address, which determines which legislation is applied to the (mobile)
E-commerce transaction (origin of source issues). The E-commerce taxation
directive [11] says that indirect taxes should be paid according to the law of
the country where the merchant has its residence (taxation of origin
principle). Electronic goods are considered services in the EU and are treated
accordingly with respect to taxation.

GSM and other wireless networks, and especially the Wireless Application
Protocol technology designed for GSM and subsequent mobile networks [21], have
now opened access to the Internet for hand-held mobile terminals. Bluetooth
technology [20] will further enhance the sphere of mobility. Both also
facilitate mobile E-commerce. Using these technologies, both customer and
merchant can now be mobile, although it is probable that a customer is more
mobile than a merchant.

In this article we review the need for a transaction model and corresponding
transactional mechanisms, and their usefulness for E-commerce in general and
for mobile E-commerce in particular. We tackle the issue both theoretically and
empirically. In Section 2 we review the work done so far in the field.
Section 3 consists of trials of three E-commerce sites, two in Finland and one
in the USA. The trials show important differences in the structure of
E-commerce transactions in different cases that must be taken into
consideration when transactional support is developed further. In Section 4 we
discuss the transactional requirements for “fixed” E-commerce. In Section 5 we
analyze the transactional and other related requirements in the mobile
environment. Section 6 concludes the paper.

1 r 

Work in the transaction field started at the beginning of the seventies. In
the eighties new application areas emerged and new transactional issues were
confronted. The book [19] is a rather good overview of the work done in the
field during the eighties.

Electronic commerce is a new application area needing transactional support.
One of the first well-known analyses in this application context is [34]. In
it, the author concentrates on the atomicity aspects of E-commerce
transactions, introducing money atomicity, goods atomicity, and certified
delivery. Money atomicity states that funds transfers must be implemented in
such a way that the money is either moved from one party to another or not
moved at all, i.e. the money is not “lost”. It is worth noticing that the
Semantic transaction model, aiming mainly at money atomicity in a distributed,
highly autonomous international banking environment, was introduced already in
the eighties and analyzed thoroughly in [35,36]. The term semantic atomicity
was used, because the meaning of atomicity can be semantically specified and
enforced in the S-transaction model.

Goods atomicity guarantees money atomicity and, in addition, the semantic
constraint: “The customer will get the goods if and only if the money is
transferred from her to the merchant”. Finally, certified delivery guarantees
goods atomicity and additionally that both customer and merchant are able to
prove exactly which goods were delivered. The latter protects the customer
against a fraudulent merchant (who might cheat by delivering wrong or defective
goods or delivering nothing) and the merchant against a dishonest customer, who
would claim that the goods delivered deviated in some sense from those ordered.
Tygar continues the work in [33], where he discusses whether and how anonymity
and atomicity could be combined.
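
The three atomicity notions build on each other, which can be made explicit as
predicates over the observable outcome of a purchase. The sketch below is an
illustration only; the Outcome record and its fields are assumptions introduced
here and are not taken from [34].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    debited: int             # amount taken from the customer
    credited: int            # amount received by the merchant
    goods_delivered: bool    # customer received the goods
    delivery_provable: bool  # both parties can prove what was delivered


def money_atomic(o):
    # Money is moved completely or not at all; nothing is lost on the way.
    return o.debited == o.credited


def goods_atomic(o):
    # Money atomicity plus: goods are delivered iff money was transferred.
    return money_atomic(o) and (o.goods_delivered == (o.credited > 0))


def certified_delivery(o):
    # Goods atomicity plus: any delivery that took place is provable.
    return goods_atomic(o) and (o.delivery_provable or not o.goods_delivered)
```

An outcome in which the customer is debited and the merchant credited but no
goods arrive satisfies money_atomic and violates goods_atomic, which is the
failure case examined for the sites below.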

The above considerations primarily target digital goods and a three-party
structure for the transactions (customer, merchant, bank/credit card company).
Delivery of physical (or tangible) goods poses further complications, because
e.g. digital signatures cannot be used to certify the goods themselves.

Three parties can also be too small a number; the customer might want to buy
e.g. software X that requires some other software Y to run. Y is, however, not
obtainable from the merchant of X, and thus the customer has to buy either both
X and Y or neither of them. This kind of scenario requires a further
transactional property, called Distributed Purchase Atomicity in [31], that
guarantees money atomicity and goods atomicity for a set of dependent purchases
at different merchants.

Secure Electronic Transactions (SET) is probably the best known commercially
developed standard [25]. It is aimed at three-party E-commerce transactions
(customer, merchant, credit card company). From a transactional point of view
it mainly addresses money atomicity, i.e., it guarantees that the payment is
performed if and only if the customer (credit card holder) authorizes it and
the sum debited to the customer's card is the correct one. The mechanism
assumes a sophisticated certificate infrastructure that guarantees the correct
identity of the participants.

Pioneering work on secure payment protocols has been performed by Michael
Waidner and his group at IBM Zurich. A recent overview of the so-called iKP
family of protocols, closely related to SET, can be found in [3]. The work done
in part by the above laboratory and other partners in the European Semper
project (AC026) is also of a pioneering nature in the field [32].

The above work mainly concentrates on business-to-customer transactions.
Requirements for transactions in business-to-business E-commerce, with emphasis
on value chains, have been discussed e.g. in [27], and general requirements,
regulatory frameworks and their relevance for mobility aspects have been dealt
with in [22].



3 p ri c s i rr - rc s s 

We took a closer look at the following three sites: 

— www.keltainenporssi.fi (a Finnish WWW-based customer-to-customer market
place),

— www.amazon.com (a famous electronic bookstore in the USA), and

— www.bokus.com (a Nordic book store).
3.1 Keltainen Pörssi: A Finnish Electronic Market Place

We will look at the site from the service ordering angle. It is interesting,
because it is possible to pay for the service (reading the announcements of
other customers) also by Internet banking using one's own account. This is our
main interest in this case. The interactions are presented in Fig. 1.
[Fig. 1: interaction diagram; recoverable labels: “Login and choose service”,
“Confirm”, “Keltainen Pörssi”.]

It is worth noticing that the control and data both flow through the customer
site, in practice through the customer's browser. Thus, the service provider
sends the data concerning the customer's service order to the customer, who can
check it through the browser. Some of the fields contain information for the
bank, such as the transaction id. The data also contains the URI at the bank
through which this kind of service at the bank can be used by the customer and
the merchant.

From a communication security point of view, all communication happens using
HTTPS, the encrypted form of HTTP. While dealing with the bank the customer
uses his or her PINs, in the same way as when the bank services are used in a
peer-to-peer fashion. Thus, the security risk does not increase in this case as
compared to direct usage of Internet banking services.

The bank and the service provider guard against malicious customers by
providing an information flow directly between them. Thus a customer who tried
to fake the payment would have to do it on his or her own machine and,
additionally, would have to make the service provider side (in this case
Keltainen Pörssi) believe that the bank has sent the direct confirmation of the
payment, even if the customer faked it in one way or another. Depending on how
well the system at the service provider is designed and how secure the
authentication between the bank and the service provider is, this might or
might not require direct penetration of the service provider's system. Of
course, the penetration might be tried at the bank, too, but there is evidently
less hope of breaking into its systems than into those of a service provider.



Achieving Atomicity? From a transactional point of view, the three-party
interactions pose questions. Which degree of atomicity is achieved? Money
atomicity is guaranteed to the degree the banking system can provide it in
general; the accounts might be in different banks, so a domestic inter-bank
funds transfer, or even an international bank transfer, might be needed.

Goods atomicity is a bigger problem. It is evident that the money might be
taken from the customer's account while the service provider still does not
grant the service. We discuss this below. The opposite case, where the service
is granted without the customer paying, does not occur unless the customer can
cheat the merchant in the way described above.

The weak point from the goods atomicity point of view is the following: the
information about the payment (control flow from the bank to the service
provider) must arrive via two paths, both through the customer and directly. If
even one control flow breaks, the service is (evidently) denied. In addition to
technical problems like failures at any of the three parties, the control flow
is mediated manually at the customer site. In the current implementation, the
customer gets a form back from the bank on which he or she is advised to close
the transaction by pressing the “end-button” provided on the form. Behind the
button there is a URI provided by the service provider and the identification
of the payment transaction, along with the notice that it was successful. Upon
pressing the button, the browser sends an HTTP request to the server of the
service provider, using the URI meant for confirmations, along with the
parameter values. It probably also sends the previous URI visited, which in the
normal case should be that of the bank. The final confirmation of the granted
service comes from the service provider's server as a response to the request
above.
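
The two-path rule can be illustrated by a small sketch in which the service
provider grants the service only once the same transaction id has been
confirmed both directly by the bank and via the customer's browser. The class
and method names are hypothetical, and authentication of the two channels is
assumed rather than shown.

```python
class ServiceProvider:
    """Grants a service only when both confirmation paths have reported."""

    def __init__(self):
        self._from_bank = set()      # tx ids confirmed directly by the bank
        self._from_customer = set()  # tx ids confirmed via the "end-button"

    def confirm_from_bank(self, tx_id):
        self._from_bank.add(tx_id)

    def confirm_from_customer(self, tx_id):
        self._from_customer.add(tx_id)

    def service_granted(self, tx_id):
        # Either confirmation alone is insufficient: a customer cannot
        # stand in for the bank's message, and a missing "end-button"
        # request leaves the service ungranted.
        return tx_id in self._from_bank and tx_id in self._from_customer
```

If the customer's browser crashes after the bank has confirmed, only the bank
path has reported and service_granted stays False, although the money has
already moved.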

Thus, the goods atomicity of the combined payment and service provision
transaction is achieved manually. There are several places where it is
threatened. Maybe the most evident one is a crash of the customer's workstation
or the critical software (browser or the communication software) after he or
she has confirmed the payment and the bank has performed it, but before the
“end-button” has been pushed. In this case the customer either does not get the
critical confirmation information from the bank at all, or fails to press the
end button and thus fails to convey the confirmation to the server. Trying
later to convince the service provider of the payment will be rather difficult,
because the necessary information has evidently been lost.

Another point is that even if there was no crash at the customer site, the
confirmation of the payment is based on actions of people who do not
necessarily understand the necessity of following the instructions. Should the
customer ignore the advice and not press the “end-button”, it is likely that
the service is denied by the service provider, because it does not get the
other necessary confirmation of the payment. It is currently unclear whether
the customer can be made liable for losing money if he or she does not follow
the instructions given on the form returned by the bank.

Certified delivery is achieved, provided the user can use the service in the
specified way and for the paid time. The latter property can only be verified
after the entire subscription period has elapsed.



3.2 A Bookstore in USA: Amazon.com 

This company has its headquarters in Seattle, USA, and activities in several
places in the USA and Europe.



Taxation and Legislative Framework Affecting the Customer. The applicable law
for E-commerce transactions for the address amazon.com is that of the State of
Washington. The site gives the following information on what they record about
the customer: “When you order, we need to know your name, e-mail address,
mailing address, credit card number, and expiration date. This allows us to
process and fulfill your order and to notify you of your order status” [23].

Sales tax is only levied for shipments whose destination address is in the
State of Washington or Nevada in the USA. Other shipments are tax free. Any
duties levied are left to the customer. Notice that from the sales tax point of
view the residence of the customer is decisive, whereas the applicable law with
respect to disputes is that of Washington. In addition, the customs laws of the
customer's country of residence may play a role in the shipment. Thus, the
customer should know the legislation of two or three countries/states in order
to properly master the purchase transactions in this environment, at least if
something goes wrong.



The Overall Process at Amazon. The overall process is described in Fig. 2.
The customer can first pick items into the shopping cart and then register at
the merchant as part of the payment procedure. The information mentioned above
is collected. A returning customer is identified when he or she enters the site
(based on cookies). If the customer has chosen the One-Click service mode,
further identification is not necessary. When the customer gives the final OK
for the transaction, the card information is checked (if changed) and the value
of the items in the shopping cart is debited to the credit card account.
After the customer has accepted the purchase transaction, the information
(and material) flow is one-directional, from the merchant to the customer.
First, a message is sent by the merchant to the email address of the customer
when the order is placed. It usually arrives within a minute after placing the
order. This message looks as follows:

From: orders@amazon.com
Date sent: Thu, 23 Sep 1999 ...
To: veijalainenj@acm.org
Subject: Your Order with Amazon.com (...)

Thank you for ordering from Amazon.com! Your order information appears
below. If you need to get in touch with us about your order, send an e-mail
message to orders@amazon.com (or just reply to this message).

Amazon.com Customer Service

Your order reads as follows:

E-mail address: veijalainenj@acm.org

Ship to: ...

“Memoirs of a Geisha” Arthur Golden; Paperback; $7... each (Usually
ships in 24 hours) ...
The next piece of information sent by the merchant is the announcement that
the items have been shipped and the statement that this completes the order.

From: orders@amazon.com
To: veijalainenj@acm.org

... via e-mail (orders@amazon.com), ... or phone (... for US customers or ...
for international customers). Thank you for shopping at Amazon.com. ...

After that, the items should arrive at the address specified.

If the customer is not satisfied with the product(s), he or she can return the
items to the merchant within 30 days of receiving them.

RETURNS POLICY: “Our returns policy is simple. Within 30 days of receipt of
your order, you may return any of the following items for a full refund: any
book in its original condition, or any book we recommended (but you didn't
enjoy) in any condition; any unopened music CDs, DVDs, VHS tapes, or software;
toys, electronics, tools, home-improvement items, and any other merchandise in
new condition, with its original packaging and accessories. Please note that
we can process returns and refunds only for items purchased from Amazon.com.
You can also cancel unshipped items. Find out more about how to cancel...” [ ].



Security Considerations. Is the security high enough and the authentication
proper in this kind of trading system, and could they be improved by
transactional means? If the returning customer is recognized based on cookies,
the HTTP server at Amazon asks whether it has resolved the identity correctly
(of course, a fraudulent person would not say no). This makes possible the
“One-click service”, where the customer does not need to type in any
identification information, provided he or she is coming from a machine where
the old cookies are stored. Rather, he or she can collect books and other items
into the shopping cart and just acknowledge the bought items at the end of the
session.

This opens up the possibility of using the same machine at the customer site
to buy books and other items even if the actual customer would not want it. A
further danger is that the service at Amazon offers the possibility to send
gifts to other persons. A fraudulent person could use this possibility to send
books to him- or herself or to a third person, if he or she gets access to the
machine of a customer and/or somehow gets hold of the relevant cookie file. The
gifts are sent to a person only after he or she has responded to an email sent
by the merchant and subsequently revealed the delivery address, but this does
not reduce the risk much.



3.3 Bokus: A Nordic Book Store 

This case is rather similar to Amazon. The difference is mainly that the
bookstore operates in the Scandinavian countries and applies the local
mail-order business rules. Thus, instead of charging the customer before or
upon shipping, the store sends the book along with the invoice. The customer
can then either pay the bill or return the book without paying. The risk is
thus more on the merchant side, because the customer will not pay before the
items ordered have arrived and satisfy the customer.

From a structural point of view, the difference to Amazon is that this is a
genuine two-party ordering transaction between the customer and merchant. A
third party might be needed later if the payment happens through the banking
system, but it is conceptually not necessary.

A difference to Amazon is that the order is confirmed by email within three
days after the order is placed through a WWW interface at the server. Thus, the
customer might be in doubt for three days whether the order will be accepted or
not. Another interesting difference is that the bookstore wants to know the
birth date of the customer. They do not tell why, but evidently they do not
necessarily want to sell to minors. Maybe they also have advertising and
security aspects in mind, because knowing the birth date makes the address
check in the Nordic countries easier.

From a process point of view, there is a slight difference to Amazon in that
the customer may also cancel the order by email. In this case it remains a bit
unclear from the instructions when the book itself should be returned. Nor is
it said very clearly what will happen if the book is returned and it is not in
an acceptable condition.

The bookstore does not tell openly what information they will collect. For 
instance, if I pay the invoice through a bank, they will get the account informa- 
tion and could add it into the customer record. This reduces the risk of having 
customers with a faked identity. 
4.1 The Lower Level: The Business Process View

When looking at Fig. 2 one notices that there is a business process running
at the merchant site and that this business process has a fairly standard form
(the business process at Bokus has a fairly similar form):

1. Let the customer fill the cart.

2. After a new customer provides the identification information, along with
the credit card information, process the customer order: debit the sum to the
credit card; if OK, send a confirmation by email, otherwise send a negative
response.

3. If the customer cancels the order before the items have been shipped,
acknowledge this and cancel the payment at the credit card company; cancel the
order from the database; stop the process instance in the state “customer
cancelled before shipment”.

4. Send the items to the address given; send the delivery report to the
customer by email.

5. Wait until the time to return the items has expired or the items have been
returned.

6. If the return time has expired, close the process instance in the state
“books delivered and paid, not returned”.

7. If the customer returns the items during the 30-day period: process the
returned items; announce the possible expenditures to the customer by email;
adjust the costs at the credit card company.

8. Close the order processing process instance in the state “customer returned
the items; costs adjusted”.

Looking at Fig. 2 we also see that the business process at the merchant runs
for a rather long time, over 30 days (21 days at Bokus), at least conceptually.
It is thus a long-lasting process instance. The business process in Fig. 1
normally runs for only a couple of minutes. This has certain implications for
the implementation of the processes. Both have, however, a rather similar
overall goal: “The customer pays if and only if exactly the ordered
service/items are delivered”. This is another formulation of certified
delivery.
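
The merchant-side steps listed above amount to a small state machine with a
fixed set of acceptable end states. The following sketch encodes it as a
transition table; the state and event names are illustrative labels for the
listed steps, not identifiers used by Amazon or Bokus.

```python
from enum import Enum, auto


class State(Enum):
    CART = auto()
    PAID = auto()
    REJECTED = auto()
    CANCELLED_BEFORE_SHIPMENT = auto()
    SHIPPED = auto()
    DELIVERED_NOT_RETURNED = auto()
    RETURNED_COSTS_ADJUSTED = auto()


# (current state, event) -> next state, following steps 1 to 8.
TRANSITIONS = {
    (State.CART, "debit_ok"): State.PAID,
    (State.CART, "debit_failed"): State.REJECTED,
    (State.PAID, "cancel"): State.CANCELLED_BEFORE_SHIPMENT,
    (State.PAID, "ship"): State.SHIPPED,
    (State.SHIPPED, "return_period_expired"): State.DELIVERED_NOT_RETURNED,
    (State.SHIPPED, "items_returned"): State.RETURNED_COSTS_ADJUSTED,
}

END_STATES = {
    State.REJECTED,
    State.CANCELLED_BEFORE_SHIPMENT,
    State.DELIVERED_NOT_RETURNED,
    State.RETURNED_COSTS_ADJUSTED,
}


def step(state, event):
    """Apply one event; reject events that are not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in {state.name}")


def run(events):
    """Replay a sequence of events starting from an empty cart."""
    state = State.CART
    for event in events:
        state = step(state, event)
    return state
```

For example, run(["debit_ok", "ship", "items_returned"]) ends in
RETURNED_COSTS_ADJUSTED, while a "cancel" event after shipment is rejected,
mirroring the rule that cancellation is only possible before the items have
been shipped.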

4.2 Jeopardizing the Certified Delivery

When could the customer lose money but not get the items in a case obeying the
business process structure exemplified by Amazon? This happens if something
goes wrong at the merchant site, at the credit card company site, or during the
delivery, after the merchant has asked the credit card company to charge for
the ordered items and this has been performed. The transaction that charges the
customer's credit card at the credit card company is a typical
business-to-business E-commerce transaction that guarantees money atomicity and
a coherent view of the outcome on both sides. There is as such nothing new; the
participants must carefully log the various states of this E-commerce
transaction and use basic ACID transactions at the database to implement it.

The interaction between the merchant and the credit card company is completely
out of the control of the customer, in contrast to the service provision case,
where the customer was responsible for the atomicity. Errors at that stage are
not very probable, because the problems are rather well understood and
standards like SET [25] have been developed to tackle them.

Another point where things can go wrong is the shipment after the book has
been charged. First, the customer might not get the email that notifies about
the successful order. The reason might be e.g. a wrong email address, which
again can be the customer's fault, or could be caused by problems at the
merchant system or at some other part of the network. If the message does not
leave the merchant site, it is again an atomicity problem of the implementation
of the business process step that notifies about the success. Other delivery
problems are more communication-related.

Should somebody want to misuse one's credit card, the email confirmations
would have to be directed to an address not belonging to the card owner. This
could be done rather easily by giving, when the customer identification is
supplied, an abstract email address that is mapped to another one in a server
that does not reveal the mapping to the outside. If somebody found out the
password of the customer at the merchant, then of course the address and other
information can be temporarily changed so that the credit card owner does not
get confirmation of the items purchased behind his or her back.

Further, the logistics might fail in delivering the items. This is mostly
outside the control of both the merchant and the customer, and although
transactional mechanisms might help in keeping track of the status of the
delivery process, they are outside the scope of this paper.



Transactional Requirements for the Business Processes and Distribution
Aspects. Analyzing the abstract goal of certified delivery and the possible
problems above, one easily sees that several sub-goals need to be guaranteed at
the lower level of abstraction, namely at the business process level and at the
distributed system level. The following two requirements should be satisfied:

1. The business processes involved run until some acceptable end state.
Intuitively this means that none of them stops in the middle, but each of them
runs either to full success or to complete cancellation.

2. The end states of different processes are semantically coherent with each
other. This means that if for instance the customer paid for the items or
service and did not cancel the order in due time, then the business process
ends in the state “service/items delivered”. Otherwise it should end in the
state “not delivered” or “items returned”, depending on when and how the
customer decided to cancel the order.

The first requirement above forces us to guarantee (semantic) atomicity for
the business processes. A part of guaranteeing this is to guarantee atomicity
of the steps by lower level mechanisms, like usual database transactions. An
important subproblem is to guarantee that the states of the business process
are stored permanently and that steps involving state changes and messages sent
between sites are implemented in an atomic way.
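
One common way to make a state change and the message announcing it atomic is
to write both in one local database transaction and let a separate sender
deliver the queued messages afterwards. The sketch below uses Python's standard
sqlite3 module; the table layout and the function names are assumptions made
for illustration and do not describe any of the systems examined here.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the process-state table and the outgoing-message queue."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS process ("
        "order_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "order_id TEXT NOT NULL, message TEXT NOT NULL)"
    )
    db.commit()
    return db


def advance(db, order_id, new_state, message):
    """Record the new state and queue its notification in one transaction."""
    # Used as a context manager, the connection commits if the block
    # completes and rolls back if it raises.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO process (order_id, state) VALUES (?, ?)",
            (order_id, new_state),
        )
        db.execute(
            "INSERT INTO outbox (order_id, message) VALUES (?, ?)",
            (order_id, message),
        )


def current_state(db, order_id):
    """Return the stored state of an order, or None if it is unknown."""
    row = db.execute(
        "SELECT state FROM process WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else None
```

If queuing the message fails, the state change is rolled back with it, so the
process never reaches a state whose notification was not at least recorded for
sending. Delivery of the queued messages to the other site still has to be
retried until acknowledged, which this sketch does not cover.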

The second requirement can only be achieved by means of distributed
transactions. Abstractly speaking, it is a question of guaranteeing in a
distributed system that a set of nodes agrees on the same value. The usual
problems (processing errors, lost messages, etc.) must be handled. The
peculiarity of the environment is that it is hostile in the sense that the
nodes cannot necessarily be trusted (there might be malicious customers or
merchants). Therefore, the possible hostility must be taken into consideration
when designing these mechanisms. For instance, Keltainen Pörssi does not rely
on the customer as the coordinator of the transaction, because the customer
might pretend that the other part (the funds transfer) had happened even if it
had not. Therefore, there is a direct information flow between the bank and
the service provider. This makes the structure different from, e.g., the 2PC
protocol, where the coordinator is assumed to be benevolent, respecting the
decisions of the subordinates and really asking for their votes instead of
faking or guessing them.
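
The point about not trusting the customer can be illustrated with a payment
confirmation that only the bank is able to produce. The sketch below is
hypothetical and does not describe the actual Keltainen Pörssi protocol: it
assumes a secret key shared between bank and service provider and uses an
HMAC as the confirmation tag. The key and the function names are invented for
the example.

```python
import hashlib
import hmac

# Hypothetical key known to the bank and the service provider only.
BANK_PROVIDER_KEY = b"demo-shared-secret"


def bank_confirm(order_id: str, amount_cents: int,
                 key: bytes = BANK_PROVIDER_KEY) -> tuple[bytes, str]:
    """Bank side: issue a confirmation that the funds transfer took place."""
    payload = f"{order_id}|{amount_cents}".encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def provider_accepts(payload: bytes, tag: str,
                     key: bytes = BANK_PROVIDER_KEY) -> bool:
    """Provider side: accept only confirmations carrying a valid tag."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

A customer who does not know the key can neither create a confirmation nor
alter the amount in an existing one without the check failing, so the service
provider does not have to rely on the customer's word.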

One remedy to the above problems would be to require an ack or nack from the
customer in response to the shipment announcement. The merchant would then be
able to check that the real customer knows about the shipment and really
wants it. The customer could also confirm the reception of the items to the
merchant. This would alleviate the uncertainty about the success of the
delivery to the right customer and thus confirm the goods atomicity aspect.
But even these two additional confirmations would not help very much in cases
where a person simply takes another person's credit card information and uses
it to purchase items. The only information Amazon can easily check is the
credit card number, the name on the card, and the expiration date. Everything
else can be fabricated. Thus, there is no real authentication of the person
anywhere in the current Amazon system.

To guard against these kinds of misuse, an end-to-end authentication feature
should be incorporated into the system. A person should be able to identify
him- or herself to the merchant before orders can be placed and delivery
initiated. One solution would be to use public and private keys and an
infrastructure (Certification Authority) to guarantee the allocation of the
keys to persons.

Using both strong authentication mechanisms and distributed transactional
mechanisms, the level of security, authentication, and non-repudiation can be
raised much higher than it is today. On the other hand, authentication
combined with a distributed transactional mechanism that guarantees
distributed atomicity can also be a problem for the smooth running of the
business process at the merchant. The degree of autonomy in running the
process is reduced, because the customer can stop it by not confirming a step.
Confirmations also reduce the autonomy of the customer, because he or she must
issue them. Thus, there is clearly a trade-off between the desired level of
security and authentication and the reduction of the degree of autonomy on
both sides. How much are customers and merchants ready to pay for increased
security?

An evident requirement in this context seems to be that the customer should
be able to choose the security level and at the same time accept the
consequences. It is reasonable to think that the more valuable the purchased
items are, the higher the need for security and proper authentication. It is
much less of a problem for a customer if a 10 dollar book is not delivered,
but it is a rather big problem if a 100,000 dollar car gets charged but never
arrives.



Cancelling an Order. Cancellation is a natural part of the business we have
been dealing with above. For the (Information Society) services it is not so
important, because the service is delivered at the same instant as it is paid
for. There might be, however, good reasons to entitle a customer to cancel an
order for digital goods, too. If a piece of software does not function, the
customer should have a means to cancel the order.

The cancellation of the delivery of goods is interesting from a transactional
point of view. The term often used for such a case is compensation. In the
Amazon case the customer is entitled to cancel the order as long as the order
has not been moved to the state “shipped”. A book can also be returned to the
merchant, in which case the merchant should pay back the price, or at least a
part of it.

It is worth noticing that cancelling the order before shipment and cancelling
it after the delivery differ significantly from a practical point of view,
although in abstract terms the end state of the process would be the same:
“The item was not sold and the customer did not pay (almost) anything”.

It is mainly the merchant's problem to integrate cancellation smoothly into
the business process.
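
The difference between the two cancellation paths can be made concrete with a
small state machine. This is an illustrative sketch only: the state and event
names are chosen for the example, and the return fee is a hypothetical
parameter standing for the case where only part of the price is paid back.

```python
from enum import Enum


class OrderState(Enum):
    ORDERED = "ordered"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"
    RETURNED = "returned"


# Cancellation is only allowed before shipment; after delivery the
# compensating action is a return.
TRANSITIONS = {
    (OrderState.ORDERED, "ship"): OrderState.SHIPPED,
    (OrderState.ORDERED, "cancel"): OrderState.CANCELLED,
    (OrderState.SHIPPED, "deliver"): OrderState.DELIVERED,
    (OrderState.DELIVERED, "return"): OrderState.RETURNED,
}


def next_state(state: OrderState, event: str) -> OrderState:
    """Apply an event, rejecting transitions the process does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.value}") from None


def customer_cost(state: OrderState, price_cents: int,
                  return_fee_cents: int = 0) -> int:
    """What the customer has paid in total once an end state is reached."""
    if state is OrderState.CANCELLED:
        return 0
    if state is OrderState.RETURNED:
        return return_fee_cents  # hypothetical part of the price not refunded
    if state is OrderState.DELIVERED:
        return price_cents
    raise ValueError("order has not reached an end state")
```

Both CANCELLED and RETURNED correspond to the abstract end state "not sold",
but they are reached through different events and may leave the customer with
different costs.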

5 Transactional Mechanisms in Mobile E-Commerce

5.1 Relating the Mobile Environment with the Fixed One

What is the difference between a mobile environment and a more traditional
network environment? Mobility can be understood in several ways. One can
think that a person moves from one physical place to another, but does not
carry any mobile equipment with him or her. In this case he or she uses the
locally available (fixed) network infrastructure to take actions in the
network, including issuing E-commerce transactions.

Assuming that the above mobile person does carry with him or her a piece of
personal equipment that facilitates access to the network resources, we come
closer to the concept of mobile E-commerce. The most general idea is that a
customer can conduct E-commerce at any time and at any place using miniature
devices.

Overall, our view is that mobile hand-held devices and the supporting
networks are a special access technology to the Internet or another backbone
network facilitating many services, including E-commerce.

5.2 Authentication and Authorization Issues 

The mobility of a person makes it evident that authentication and
authorization are of paramount importance in the mobile environment. By the
nature of mobility, authentication of a customer cannot be based in any
respect on the physical location or similar criteria. Seen from this
perspective, the mobile authentication and authorization problem does not
change radically depending on whether or not a person carries a piece of
equipment. The reason is that the current small hand-held devices are rather
vulnerable to theft. Thus, basing authentication of a person on the equipment
identity or data stored therein is less secure than in the case of a fixed
network and fixed workstations (cf. the cookies in the case of Amazon) and
should be avoided. For the same reason, authorization of mobile E-commerce
transactions should not be blindly and solely based on a device identity of
any kind; stealing the device after a proper authentication of the correct
user would open full access to all services for the wrong person. This also
has certain ramifications for the transactional services we have been
discussing.

We discussed above in Section 4 the need to combine end-to-end authorization
of a customer and the merchant in different phases of the E-commerce
transaction in order to minimize the probability of undetected failures and
fraudulent orders. Authentication of a person should be based on something
personal, like fingerprints, eyes, etc. Currently, however, the approach is to
use some piece of information that is known only by the person to be
identified and the system, like a Personal Identification Number (PIN) or a
password. Another approach is to base the identification on something the
person possesses and controls, typically a smart card. The latter are
typically used together with PINs, because stealing a smart card is not
difficult and the mere possession of it must thus not be enough for
identification.

Authorization in a mobile environment is a trickier issue than in a fixed
environment. The typical general solution is to use private and public keys
and the corresponding infrastructure (cf. [25]). There are several reasons
why the private key should be kept in the terminal or at least on a device
from which it can be automatically read by the terminal: it is a long bit
sequence and thus impossible to remember by heart; keeping it on paper would
expose it rather easily to other people; typing it in every time it is needed
would be very error-prone and tedious. In practice, the key should be
protected by a much shorter PIN that is known only by the person the secret
key is attached to. Thus, using the authorization key would happen in tight
symbiosis with authentication of the user.

One solution to the latter problem is to store the secret key on a SIM card
[26]. The operator would be responsible for establishing the link between a
person and his or her private and public keys and for delivering the SIM card
to the right person. This would solve the key distribution problem. Storing
the private key on a smart card readable by a hand-held terminal makes it
possible to generate the digital signatures on which authentication and
non-repudiation can be based. The weak point is that, at least currently,
access to the SIM card is protected by a PIN of only four digits. The PIN
cannot be much longer, because people could no longer remember it and would
use PINs like 1111111..., or would write them on paper or use other means to
memorize them. Thus, too long a PIN would be more vulnerable than a shorter
PIN. It remains to be seen whether there will be more reliable authentication
methods that are not based on PINs in the future.
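
The interplay of a short PIN and a stored key can be sketched as follows.
This is a toy model under loud assumptions, not a description of real SIM
cards: an HMAC stands in for a genuine digital signature (a real card would
use an asymmetric private key), and the limit of three failed attempts is an
assumption added for the example, not something stated in this paper.

```python
import hashlib
import hmac


class PinProtectedKey:
    """Toy model of a card that releases signatures only after a PIN check."""

    MAX_ATTEMPTS = 3  # assumption for this sketch

    def __init__(self, secret_key: bytes, pin: str) -> None:
        self._key = secret_key
        self._pin_hash = hashlib.sha256(pin.encode()).digest()
        self._failures = 0

    @property
    def blocked(self) -> bool:
        return self._failures >= self.MAX_ATTEMPTS

    def sign(self, pin: str, message: bytes) -> bytes:
        """Return a tag over the message if the PIN is correct."""
        if self.blocked:
            raise PermissionError("card blocked")
        candidate = hashlib.sha256(pin.encode()).digest()
        if not hmac.compare_digest(candidate, self._pin_hash):
            self._failures += 1
            raise PermissionError("wrong PIN")
        self._failures = 0
        # HMAC is only a stand-in here; it does not give non-repudiation.
        return hmac.new(self._key, message, hashlib.sha256).digest()
```

Under these assumptions, and if the PIN were chosen uniformly at random, a
thief guessing four-digit PINs would succeed with probability at most
3/10000 before the card blocks. The key never leaves the object; only
signatures do.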



5.3 Mobility versus Legal Basis 

The legal basis for dispute handling in the ISS case seems to be somewhat
problematic in the EU, because it should be based on the applicable law of
the customer's residence. The rationale behind this is that the customer
needs to know only one legislation and can be expected to know the
legislation of his or her own country best. As long as the customer is a
citizen of an EU member country, this is not a big issue, because one can
easily determine where the customer's permanent residence is. Mobility does
not change this consideration.

For non-EU customers coming from abroad, this approach becomes a problem.
Especially in mobile E-commerce performed by non-EU citizens within Europe,
it is not evident which legislation should be used in dispute handling.

It would be very difficult to apply the legislation of an arbitrary non-EU
country within the EU. One solution would be to apply the legislation of the
member country where the order was placed. This is also rather problematic,
because the customer might move from one country to another within Europe
while placing an order.

Another evident solution for determining the applicable legislation for
mobile non-EU customers trading within the EU borders would be to use the
legislation applicable to the merchant. This would be sound in the sense that
all non-EU customers could be treated in a similar manner, no matter whether
they are inside or outside the EU while trading with a merchant. This would
correspond to the current practice in the USA (cf. the Amazon example).

How about taxation? At Amazon the taxes can always be determined, because the
delivery address is decisive, no matter where the customer was when the order
was placed. Problems might still arise if Amazon begins to sell digital goods
that are delivered to a mobile terminal. In Amazon's view, the duties are a
problem of the customer. The same view is shared by the EU.

In the EU, VAT should be paid in some country if the service or product is
consumed within the EU. If it is paid in the country of origin, it does not
need to be paid a second time. The current approach is that the service
provider or merchant is primarily responsible for paying the VAT; an
individual customer is not. Individual customers need not pay VAT if they
purchase goods from abroad for their personal use, with some minor exceptions
[12]. This also applies to E-commerce.

In general, the European Commission pursues a simple, clear, and neutral
taxation framework, as is stated in [12]:

    Legal certainty enables commerce to be conducted in an environment where
    the rules are clear and consistent, reducing the risks of unforeseen tax
    liabilities and disputes. Simplicity is necessary to keep the burdens of
    compliance to a minimum. In that respect the Commission continues to be
    fully committed to the introduction of the future common VAT system based
    on taxation at origin and providing for a single country of registration
    where an operator would both account for and deduct tax in respect of all
    his EU VAT transactions.

    Neutrality means that:

    - the consequences of taxation should be the same for transactions in
      goods and services, regardless of the mode of commerce used or whether
      delivery is effected on-line or off-line;

    - the consequences of taxation should be the same for services and goods
      whether they are purchased from within or from outside the EU.

Although not likely, there might be states or groups of states where the
location of a customer or merchant while placing an order would have
implications for the legislative framework applicable to the corresponding
transaction. Currently, this would be a problem, because there are no
standardized technical means to expose the customer location to applications
(although the information about the location at the base station resolution
level is known by the network infrastructure). Location determination would
actually be a new dimension or level in the atomicity discussed earlier,
because it presupposes an atomic decision about the applicable legislative
framework based on spatial information, and the recording of that decision,
while placing an order. Should the actual location while placing an order, or
using a service, have legal consequences in some part of the world,
supporting this would require a rather heavy technical infrastructure.



5.4 Technical Problems in Supporting Mobile E-Commerce Transactions

We take here the view that the dominant form of mobile E-commerce will be
based on miniature, hand-held devices. Keeping this in mind, one can argue
that the same or similar E-commerce services as discussed above in Section 3
can be offered to mobile terminals, e.g. to WAP terminals. There is strong
pressure to go in this direction among the banks, mobile equipment
manufacturers, and other service providers.

What are the problems? We have discussed the basic problems above concerning
security, authentication, authorization, etc., and suggested certain
transaction-oriented solutions. Mobility does not solve any of the problems
discussed above; on the contrary, the mobile environment rather raises the
question of how even the current mechanisms could be implemented in it.

Currently, hand-held terminals are still more error-prone than stationary
workstations. In addition, it is rather common that a mobile terminal loses
the network connection. This happens because the terminals are effectively
C-autonomous [35]: the user decides to turn off the radio transceiver, the
terminal runs out of battery, or the user simply moves during the transaction
processing to a place where there is no coverage. Should the connection be
lost during a service ordering and provision transaction, e.g. at Keltainen
Pörssi at the critical places described above, it is likely that the customer
will lose money. This is because there is currently no mechanism guaranteeing
delivery atomicity either at the terminal or at the bank server in the case of
a crash.

It is evident that the states of a service provision transaction such as the
one discussed above should be logged more carefully at the three sites. Based
on the logs, it should be possible to continue the transaction after the
terminal has possibly been recharged and rebooted and the communication
capability has been re-established. It is also evident that providing a more
resilient service provision than what is offered now at the example bank and
service provider requires changes to the current implementation of the
services, especially at the bank site. Also, the terminal must store more
information about the status of the transaction in order to be able to
continue it after a terminal crash or communication failure. This requires
some kind of distributed recovery protocol, the exact form of which needs to
be investigated.

One evident technical requirement in this context is that the amount of 
information stored for recovery purposes at the terminal must not be large. 
Otherwise, the current smallest hand-held terminals could not cope with it. 
Another requirement is that the algorithms needed to support the transactional 
mechanism, security and authentication at the terminal should not be overly 
complicated, i.e. their time and space complexity should be modest. 
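
To make the terminal-side part of these requirements concrete, the following
sketch keeps a compact append-only log with one short record per state change
and reconstructs the pending transactions after a restart. It is only an
illustration under assumptions: the record format, the state names, and the
function names are invented here, and the distributed part of the recovery
protocol is not covered.

```python
import json
import os

# Hypothetical end states; a transaction in one of them needs no recovery.
END_STATES = {"delivered", "cancelled"}


def log_state(path: str, tx_id: str, state: str) -> None:
    """Append one small record and force it to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"tx": tx_id, "state": state}) + "\n")
        f.flush()
        os.fsync(f.fileno())


def pending_after_restart(path: str) -> dict:
    """Replay the log and return the last state of each unfinished transaction."""
    last = {}
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                try:
                    rec = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record from a crash during a write
                last[rec["tx"]] = rec["state"]
    except FileNotFoundError:
        return {}
    return {tx: s for tx, s in last.items() if s not in END_STATES}
```

Replay is a single linear pass over the log, and each record is a few dozen
bytes, which is in line with the modest time and space budget asked for
above. A real terminal would also need to truncate the log once transactions
have ended.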

Which requirements would the business process-oriented approach discussed in
Section 4 pose in the mobile environment? From a mobile device point of view,
it might be hard to persistently store the entire state of several
long-running transactions. This would, however, be required in practice
should the customer want to have several common checkpoints with the
merchant's process (confirming the start of the delivery, acknowledging the
delivery of the goods). Remembering many pending orders and their status by
heart would be too tedious.

How many process variants should be supported at the terminal? For both the
Amazon and Bokus cases a rather similar mechanism between the terminal and
the merchant site would be of help when the order is placed. Storing the state
of the E-commerce transaction (like ordered/non-shipped, shipped, cancelled)
at the terminal would be of benefit, especially if confirmations were used.
How this could be technically done in a WAP environment, for instance, is not
clear at the moment. Further, whether it is possible to develop only a small
set of typical process specifications for the terminals remains to be seen.
They should be so simple that programming them using WMLScript in a WAP
environment would be feasible.

Implementing end-to-end authentication and authorization at different phases
of the transaction would raise the security level of mobile E-commerce. It is
for further study whether the SIM card-based authorization technology can
easily be used in the context of advanced transactions.

6 Conclusions

In this paper we have analyzed a few real examples and discussed what kinds of
problems there are in current E-commerce. It turned out that from a structural
point of view one can distinguish between two-party (customer-merchant) and
three-party (customer-merchant-bank/credit card company) E-commerce
transactions. It is probable that there will be even more parties involved in
the future, when content providers enter the stage or the customer has to buy
several products from different merchants in order to compile a functioning
system.

One of the findings is that the regulatory frameworks seem to be different
in Europe and the USA, and this has certain ramifications for the technology
and transactional mechanisms that might be considered reasonable in this
context. The regulatory frameworks and emerging de facto standards must be
studied further. Depending on how they evolve, it might or might not be
possible to design a more or less homogeneous transactional support for mobile
E-commerce.

Another conclusion is that the transactional mechanism must be closely
related to the authentication and authorization mechanism and that the main
benefit of using them together is actually to guarantee security and other
related properties in the mobile environment. Further, the transactional
mechanism, the authentication and authorization mechanisms, and their
combination should be applicable in a flexible way. If the customer does not
want to use them, he or she does not need to. The current systems we reviewed
here are more or less lacking these kinds of features, but mobile E-commerce
seems to need them.

An important conclusion of this study is that transactional mechanisms are
needed in two rather orthogonal directions. On the one hand, the individual
workflows implementing the business processes at the merchant and customer
(and possibly at a third party such as a credit card company or bank) should
exhibit certain transactional properties, especially (semantic) atomicity and
durability. This guarantees that the workflows reach an acceptable end state
in spite of failures. On the other hand, in order to ascertain that the
customer, the merchant, and possible further parties all have the same view
of the state of an E-commerce transaction, one needs distributed transaction
guarantees, especially distribution atomicity. Based on these lower-level
goals, the higher-level goals, like goods atomicity and certified delivery
coined in [34], can be achieved.

Mobility seems to pose one genuinely new problem. The location of the
terminal at the point of time the order is placed might play a role as
concerns the legislation to be applied. It is still open how this issue will
be put aside or solved within the forthcoming regulatory framework(s).

Apart from the above peculiarity, mobile E-commerce is similar to “fixed”
E-commerce. The new issues to be solved are mainly caused by the communication
autonomy and miniature size of the hand-held terminals used to facilitate the
E-commerce. Authentication and authorization require, for instance, a more
careful treatment, because the hand-held terminals can be stolen at any point
of time. Authorization should not be based on device identity, session, or
similar concepts only; genuine end-to-end user authentication should be
enforced at critical moments.

An important implementation-level issue is how to achieve failure resilience
in mobile terminals. Another issue is how authentication and authorization
can be organized in such a way that stealing a hand-held device would not
jeopardize them. At least a partial solution to this seems to be a special
SIM card containing the private key of a person.

Many aspects are still open in this field. More research and experiments are
needed to find out whether transactional mechanisms are really needed and
exactly how they should look in the mobile E-commerce environment. Bluetooth
technology [20] will evidently raise new kinds of considerations in this
respect by allowing terminals and cash registers to talk directly to each
other, also for the purpose of mobile E-commerce.

Abstract. Electronic Commerce is a rapidly growing area that is gaining
more and more importance not only in the interrelation of businesses
(business-to-business Electronic Commerce) but also in the everyday
consumption of individuals performed via the Internet (business-to-customer
Electronic Commerce). Since Electronic Commerce is a very interdisciplinary
area, it has an impact on various communities. The goal of this paper is to
identify and summarize the impact of Electronic Commerce from a database
transaction point of view and to highlight open problems in transaction
management arising in Electronic Commerce applications by reflecting the
discussions of the working group “Transactions and Electronic Commerce” held
at the TDD Workshop.



1 Motivation 

The exchange of electronic data between companies has been an important issue
in business interactions for quite a while. However, the recent proliferation
of the Internet together with the rapid spread of personal computers has led
to an enormous diversification and ubiquity of business performed
electronically.

Since various participants, all having different requirements and operating
in different, distributed, and heterogeneous environments, are involved in
Electronic Commerce interactions, a number of problems arise in this context.
These affect communication infrastructures as well as business models and
cryptography, but also aspects of information management, information
integration, and transaction processing. In the latter case, examples of
problems to be dealt with in order to provide feasible solutions are fair
interactions between all participants in the presence of potentially malicious
parties and the enforcement of complex business processes with certain
transactional execution guarantees, all to be supported in a holistic way by
an appropriate Electronic Commerce infrastructure.

The paper summarizes the working group “Transactions and Electronic Commerce”
held at the TDD’99 workshop¹ and points out the benefits transactions can
offer to protocols and applications in Electronic Commerce, but also the open
problems that still exist in this area. In Section 2, the impact of the
Internet on Electronic Commerce is discussed. Section 3 generalizes the idea
of distributed transactions by addressing transactional workflows. In order to
cope with the special requirements of Electronic Commerce, agent-based systems
and architectures (Section 4), secure transactions (Section 5), and legal
issues (Section 6) are discussed. Finally, a summary and conclusion are given
in Section 7.

¹ The participants of this working group were (in alphabetical order):
N. Aoumeur, H. Balsters, A. Berztiss, B. de Brock, A. Fent, S. Gangarski,
G. Guerrini, E. Kindler, C. Leon, K. Nagi, J. Pinto, A. Popovici,
E. Rodriguez, G. Saake, M. Saheb, R. Schenkel, K.-D. Schewe, H. Schuldt,
K. Schwarz, C. Türker, J. Veijalainen, and C.-A. Wichert.

2 Internet Transactions 

One important characteristic of most Electronic Commerce interactions is that
they take place via the Internet. Therefore, atomic commitment is an important
requirement for distributed Electronic Commerce transactions. When executing
transactions over the Internet, heterogeneity and the lack of transactional
communication protocols have to be considered. In contrast to the setting of
traditional two-phase commit (2PC) [3] solutions, this limits the
applicability of common approaches such as the XA application programming
interface of the X/Open standard [3] and necessitates further efforts.
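
For readers unfamiliar with atomic commitment, the core decision rule of
two-phase commit can be sketched in a few lines. The sketch runs in a single
process and deliberately leaves out everything that makes the protocol hard
over the Internet (message loss, timeouts, logging, recovery); the class and
function names are illustrative and are not part of XA, TIP, or any other
interface mentioned here.

```python
class Participant:
    """A resource that can vote on and then apply the global decision."""

    def __init__(self, name: str, will_commit: bool = True) -> None:
        self.name = name
        self.will_commit = will_commit
        self.state = "active"

    def prepare(self) -> bool:
        # Vote yes only if the local work can be made permanent.
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list) -> str:
    """Commit everywhere only if every participant voted yes."""
    # Phase 1: collect votes from all participants.
    votes = [p.prepare() for p in participants]
    # Phase 2: distribute the single global decision.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

A single "no" vote aborts every participant, which is exactly the all-or-
nothing outcome meant by atomic commitment.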

The Transaction Internet Protocol (TIP) [4] proposed by Tandem and Micro- 
soft aims to overcome the above mentioned problems in Internet transactions. 
Therefore, it is also highly appropriate for Electronic Commerce applications. 
The main idea of the TIP is the Two Pipe Model, the separation of applica- 
tion communication (which can take place, for instance, via non-transactional 
protocols such as HTTP) and communication at transaction manager level. 

Although atomic commitment is an important building block in Electro- 
nic Commerce transactions, it is considered not to be sufficient. Additionally, 
dynamic aspects and more flexible mechanisms for ensuring the correctness of 
complete Electronic Commerce workflow processes (transactional workflows) are 
required. 



3 Transactional Workflows 

Interactions in Electronic Commerce can be characterized by their well-defined
structure but also by their inherent complexity and their long duration.
Workflow processes are thus an appropriate means to encompass all dependencies
within Electronic Commerce interactions. As identified in [11], transactional
properties are crucial requirements in order to make these processes viable
for Electronic Commerce applications. These transactional properties have to
consider both the correctness of single processes and the correctness of
concurrent executions of processes when accessing shared resources (e.g., as
addressed in the transactional process management approach [7], which is also
exploited for business-to-business Electronic Commerce applications [1]).
Important aspects in this area are the exploitation of the flexibility offered
by workflow technology with respect to failure handling by alternative
executions, but also the ability to prove the correctness of the processes
specified.
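
Failure handling by alternative executions can be pictured with the following
minimal sketch. It is an illustration under assumptions, not the
transactional process management approach of [7]: each step is modelled as a
list of alternative activities tried in order, every successful activity
returns a compensation, and the names are invented for the example.

```python
from typing import Callable, List

Undo = Callable[[], None]      # compensation for a completed activity
Activity = Callable[[], Undo]  # performs work and returns its compensation


def run_process(steps: List[List[Activity]]) -> str:
    """Run each step, falling back to alternatives when an activity fails.

    If every alternative of some step fails, the completed activities are
    compensated in reverse order and the process ends as "aborted".
    """
    done: List[Undo] = []
    for alternatives in steps:
        for activity in alternatives:
            try:
                done.append(activity())
                break  # this step succeeded, go on to the next step
            except Exception:
                continue  # try the next alternative for this step
        else:
            for undo in reversed(done):
                undo()
            return "aborted"
    return "completed"
```

The process therefore always stops in one of two well-defined end states,
either with all steps done or with the effects of the completed steps
compensated.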







Payment processes are an important application of transactional workflows 
in Electronic Commerce [8]. However, more general approaches covering further 
phases of trade interactions such as post-sales or negotiation are required in 
order to provide complete and holistic solutions [5].

4 Agents and Transactions 

Agent-based systems are widely considered to be an appropriate means to cope 
with the ever-growing complexity of large information systems by decoupling 
them into small and easily manageable components, the so-called agents. Es- 
pecially in Electronic Commerce applications, agents are seen as an important 
means to support the different participants such as customers or (information) 
brokers. 

However, although agent-based systems may be characterized by the mobi- 
lity of single components and the dynamics emerging by continuously adding 
and removing components, transactional properties for agent interactions are 
crucial. To this end, some problems in the intersection of agent technology and 
transaction management have to be addressed: 

— The correctness of single agents — in spite of their hierarchical structure 
where decisions and plans are made at different levels of abstraction — is a 
basic prerequisite for the correctness of agent-based systems. Agent
architectures and agent implementations therefore have to be enhanced by
appropriate transactional support (intra-agent transactions) such as the
multilevel transaction approach presented in [6].

— Additionally, concurrency in agent interactions has to be dealt with. The- 
refore, inter-agent transactions are required in order to ensure the overall 
correctness in agent-based systems while considering at the same time the 
dynamics of these systems. 

— Finally, in the context of transaction management, agent technology may be 
useful to “databaseify” components and applications [12,9]. 

5 Secure Transactions 

According to the discussion in the previous sections, all Electronic Commerce
application scenarios encompass transactional interactions implemented either
by pre-defined (and possibly dynamically modified) processes or, in some
cases, on an ad-hoc basis (according to the needs of the respective business
case, possibly implemented by appropriate agents, or according to the local
legislation). Firstly, cryptographic mechanisms have to be tightly coupled to
all kinds of information exchange in Electronic Commerce transactions in order
to integrate authorization and authentication in a seamless way (see, for
instance, the SEMPER project [10]). Secondly, stronger guarantees than just
atomicity and isolation are required: it is necessary to protect sensitive
information and to ensure that no participant can cheat the others.

For the latter purpose, an extended notion of transaction would be needed.
This transaction model should use non-repudiable protocol messages in order
to prove (in case of failure) which parties corrupted the protocol. Moreover,
additional information should be added to the protocol to enable certain
trusted participants to check the correctness of a commit decision.

By using such a model, scenarios like the ones mentioned above can be
implemented and tested much more easily. Additionally, since the security of
the customer is moved into the transactional protocol, a much wider range of
applications can profit from using such a model.

Of course, there are several (justified) questions which arise in such an ex- 
tended model: 

— What would be the right formalism to describe such secure transactions? 
Without a formal description of a secure transaction, automatic checking 
of correctness and security requirements would not be possible. 

— If trusted parties are involved in secure transactions, which participants 
are to act as such parties, and under what conditions? 

— Is it possible to ensure anonymity of certain participants in secure transac- 
tions? 



6 Legal Issues 

It is no secret that laws and business regulations are not the same in 
different countries or national communities. Such differences, as for instance 
between the USA and Europe, sometimes seem to be negligible. However, they 
may have a strong impact on the Electronic Commerce software developed on 
both sides of the Atlantic. In Europe, for instance, customers have to pay after 
they receive their service; in the USA they have to pay prior to receipt. This 
leads to customer-centric Electronic Commerce scenarios in Europe and 
merchant-centric ones in the USA. 

A solution in this area would be Electronic Commerce frameworks with 
hot-spots reflecting the differences in legislation. This would help to 
homogenize the landscape of Electronic Commerce solutions and would lead to a 
higher degree of integration of companies on the two continents. However, in 
order to be able to do this, a better understanding of the concrete differences 
in legislation as well as a clear classification of such scenarios is needed. 

In addition to the different business models exploited, varying taxes applied 
in different countries or states also reinforce the heterogeneity problem encoun- 
tered in Electronic Commerce applications and necessitate appropriate support 
in coherent solutions. 

In certain cases, persistence of data gathered during Electronic Commerce 
transactions is required in order to fulfill legal regulations that demand 
a posteriori verification of the fairness of transactions. However, in this 
context, persistence also raises ethical problems in terms of privacy and the 
extent to which this data can subsequently be exploited. 







7 Conclusion 

Taking a close look at Electronic Commerce applications, it becomes clear that 
transactional execution guarantees for the diverse interactions of all involved 
participants are an important prerequisite. 

The following topics have been identified as key properties of Electronic 
Commerce solutions from a transaction point of view: 

Language Support (Specification): 

An appropriate higher-order specification language (e.g., transaction 
logic [2]) is needed to model the interactions and dependencies in Electronic 
Commerce transactions as well as to express the additional properties these 
transactions are supposed to have (for instance w.r.t. security), thus 
integrating all essential aspects of Electronic Commerce transactions at the 
specification level. 

Validation: 

Based on the specification of Electronic Commerce transactions, validation 
mechanisms should be added to the modeling environment in order to be 
able to formally prove the overall correctness, the absence of contradictions, 
or to validate certain properties of whatever transaction has been specified. 

Electronic Commerce Infrastructure: 

Apart from specification tools, support for the execution of Electronic 
Commerce transactions, enforcing all specified properties, is required. In order 
to support different business models, different interaction paradigms, etc., a 
modular, framework-oriented infrastructure for the generation of Electronic 
Commerce transactions should be provided. This would offer the possibility 
to plug together pre-defined building blocks; the advantage of this approach 
is that already validated (“certified”) components could extensively be re- 
used. Additionally, a tight link to the modeling environment by automati- 
cally transforming a specification into an executable Electronic Commerce 
transaction would complete this effort. 

All viable Electronic Commerce systems should offer the above-mentioned 
characteristics in some way. However, in order to provide feasible and coherent 
systems and tools for Electronic Commerce applications that are fit for 
commercial use, collaboration with other communities is required, since, apart 
from transactional properties, a variety of other aspects not discussed in this 
paper also have to be considered. 

Abstract. The concept of a transaction, highly significant in the context 
of data bases, is broadened to make it refer to any atomic operation that 
changes the state of a software system or its environment, or initiates a 
control action. This leads us to consider software systems as composed 
of transactional and procedural computations. We discuss the specifi- 
cation of transactional software, and introduce a mechanism for linking 
transactions into processes. We also raise several issues relating to tran- 
sactional computing that were the basis for discussion at the workshop, 
and include comments by participants. 



1 Introduction 

For some time there has been a realization that some computations are fun- 
damentally different from others. For example, Harel [10] talks of reactive sy- 
stems and transformational systems, where reactive systems are driven, at least 
to some extent, by external events. Wegner [27] distinguishes between interac- 
tive and algorithmic computation, and interactive computation is investigated in 
some detail by Kurki-Suonio and Mikkonen [14]. It is also being understood that 
different modes of computation may be needed for a single application. Thus, 
Stonebraker et al. [24] suggested quite a while back that procedures should be 
introduced into data bases. This suggestion has evolved into the active data base 
concept (see, e.g., [16]). The role that transactions play in information systems 
is studied in [25,2]. Workflow systems are particularly dependent on transactions 
[18,11]. Our purpose here is to explore these developments. 

We propose to group computations into two classes, transactional and proce- 
dural, where a software system is likely to contain components belonging to both 
classes. In so doing we shall arrive at an interpretation of a transaction that is 
both broader and narrower than the one given to it by the data base community. 
By introducing the two-way partition we identify transactional and procedural 
computation as two distinct foci for research. However, our primary purpose is 
to investigate the nature of transactional computation, and to define a number 
of discussion topics. We hope that this will contribute to a closer interaction and 
cooperation of researchers, particularly those studying information systems and 
data bases. 

Section 2 is an examination of the differences between transactional and 
procedural computation. In Section 3 we introduce our interpretation of a 
transaction, and discuss how queries fit into our framework. Sections 4 and 5 
describe an approach to the specification of transactions. In Section 6 we 
introduce a number of topics that were the basis for a discussion session at 
Schloss Dagstuhl, and comments by participants form the main contribution of 
this section. 



2 Two Modes of Computation 

Looking back at the history of computing, the earliest computations were sim- 
ple transformations of inputs into outputs. Soon there arose the realization that 
data files could be preserved from one instance of a computation to another, 
and that different applications could make use of the same file. Operating sy- 
stems were soon providing file management systems, and separate data base 
management systems arose. What characterizes this trend is the evolution of 
persistent memory: from no persistent memory at all, to rather rigid data files, 
to highly complex data bases. Persistent memory is one characteristic of transac- 
tional computation, but the more important characteristic is that transactional 
computation brings about changes in this persistent memory. 

Persistent memory is irrelevant for procedural computation. There the 
concern is with a transformation of an input into an output. Let the 
transformation be effected by a device F that accepts an input x and produces an 
output f(x). One example is the cosine function, which, given angle x, produces 
cos(x). Of course, both x and f(x) can be composite data elements. Now, quite 
often the input x will be picked up from a data base, and the result f(x) 
deposited in a data base, but as concerns F, it is immaterial where the inputs 
come from, or what happens to outputs. Procedural computation obeys what we 
shall call I/O semantics, i.e., the specification of such a computation has to 
describe the output, and indicate how the output is related to the input. Time 
is irrelevant. 

Transactional computation obeys state-transition semantics. Invariants 
describe valid states of a system. They may also indicate impermissible state 
transitions. For example, we may have a situation in which borrowers are not 
permitted to borrow books from a library if they owe money to the library. Then 
a state in which a borrower holds four books and owes money is valid, and so 
is the state in which the borrower holds five books and owes money, but not a 
transition from the first state to the second. In such a case the borrowing 
transaction can be equipped with a precondition, which, if not satisfied, stops 
further borrowing: Owes(borr) = 0. An alternative view, adapted from [9], 
considers a state space $U$, and defines a transaction to be a member of a state 
change relation $R \subseteq U \times U$, i.e., a pair $\langle S, S' \rangle$, 
where $S$ and $S'$ are members of $U$. A constraint on transactions can then be 
defined as an invariant: 

$$\forall \langle S, S' \rangle \in R : S.Owes(borr) > 0 \rightarrow S'.HasOut(borr) \le S.HasOut(borr),$$ 

where HasOut(borr) returns the number of books being held by borr. Still 
another approach is to define an invariant in terms of temporal logic, which 
introduces a further difference between the two modes of computation: time can 
be important in transactional computation. 
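
The contrast between the two semantics can be illustrated with a small Python 
sketch. It is only an illustration under assumptions of our own: the names 
LibraryState, is_valid_state and transition_allowed, and the restriction to a 
single borrower, are hypothetical and do not come from the text. The procedural 
part is a pure function of its input; the transactional part checks a pair of 
states against the library constraint. 

```python
import math
from dataclasses import dataclass


# Procedural computation: I/O semantics, no persistent memory involved.
def f(x: float) -> float:
    return math.cos(x)


@dataclass(frozen=True)
class LibraryState:
    """Snapshot of the persistent data for one borrower (hypothetical layout)."""
    owes: float    # money owed to the library
    has_out: int   # number of books currently held


def is_valid_state(s: LibraryState) -> bool:
    # State invariant (assumed): amounts and counts cannot be negative.
    return s.owes >= 0 and s.has_out >= 0


def transition_allowed(s: LibraryState, s_next: LibraryState) -> bool:
    # Transition constraint from the library example: a borrower who owes
    # money may not end up holding more books than before.
    if not (is_valid_state(s) and is_valid_state(s_next)):
        return False
    if s.owes > 0:
        return s_next.has_out <= s.has_out
    return True


four_books = LibraryState(owes=2.5, has_out=4)
five_books = LibraryState(owes=2.5, has_out=5)
```

Both four_books and five_books are valid states, yet the transition from the 
first to the second is rejected, which is the situation described above. 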

Temporal logic has been used extensively in specification (see, e.g., [13,17,23]), 
and it is well suited when there is no need to introduce real time, such as for 
precedence ordering of tasks. It has even been claimed that real time has no 
place in specifications [26]. However, sometimes it is necessary to deal with real 
time even in early requirements stages, such as when government regulations set 
strict time limits on reporting obligations by banks. Temporal logic, despite its 
elegance, need not then be the best approach. 

A problem can in general be addressed in different ways. The simplest so- 
lution is usually purely transactional, from which we can advance to solutions 
in which there is an increasing dependence on procedures. We illustrate this 
by a simple example: a date and place are to be selected for a meeting, and 
participants are to be registered for the meeting. 

At the lowest level a steering committee selects a date and place for the 
meeting, and an information system merely performs registration transactions. 
At the next higher level the steering committee selects a date, but now considers 
several places for the meeting. A procedure selects the place for which the travel 
costs of the participants are minimized. A third solution is obtained in stages. 
In the first stage the steering committee polls prospective participants, asking 
for preferred dates and places, with the preferences ranked or given numerical 
weights. An algorithm then selects a time and place that maximizes an objective 
function. Our example shows that an information system may have to allow both 
transactional and procedural computations. 
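
The second level of the meeting example can be sketched in Python as follows. 
Registration remains a transaction that changes persistent state, while the 
choice of place is a procedure with pure I/O semantics. The data layout, the 
cost figures, and all names (register, cheapest_place, travel_costs) are 
assumptions made for this illustration only. 

```python
from typing import Dict, List

# Persistent state changed by the registration transaction (hypothetical layout).
registered: List[str] = []


def register(participant: str) -> None:
    """Transactional part: changes the state of the information system."""
    if participant not in registered:
        registered.append(participant)


def cheapest_place(costs: Dict[str, Dict[str, float]],
                   participants: List[str]) -> str:
    """Procedural part: maps inputs to an output without any state change.

    costs[place][participant] is the travel cost of one participant to a place.
    """
    def total(place: str) -> float:
        return sum(costs[place][p] for p in participants)

    return min(costs, key=total)


# Invented figures, used only to exercise the two functions.
travel_costs = {
    "Magdeburg": {"ann": 100.0, "bob": 300.0},
    "Dagstuhl": {"ann": 150.0, "bob": 120.0},
}
for name in ("ann", "bob"):
    register(name)
place = cheapest_place(travel_costs, registered)
```

The third solution level would replace the total-cost function by an objective 
function over ranked or weighted preferences; the division into a state-changing 
part and a pure selection procedure stays the same. 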

Such refinement of a design is not limited to information systems. Consider a 
controller for a set of elevators in a large building. Suppose that a person on 
floor 5 requests service by pressing the UP-button. The simplest, purely 
transactional design puts floor 5 on the up-section of the pick-up agenda of 
every elevator, and all elevators will stop at this floor. The next design, 
although more complex, is still purely transactional: once an upward moving 
elevator stops at floor 5, this floor is removed from the relevant agenda of 
every elevator. The design becomes more elaborate and partly procedural when 
scheduling algorithms are added in; they are to improve efficiency by moving 
empty elevators to strategically selected holding floors and by selecting 
specific elevators for the pick-up of riders. Efficiency is improved still 
further by fine tuning of this design when the actual usage history of the 
elevators becomes available. 
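
The two purely transactional elevator designs can be sketched in Python as 
follows. The class ElevatorBank, its method names, and the representation of an 
agenda as a set of floors are assumptions made for the illustration; scheduling 
algorithms, which would form the procedural part, are deliberately left out. 

```python
from typing import Dict, Iterable, Set


class ElevatorBank:
    """Pick-up agendas for upward travel, one set of floors per elevator."""

    def __init__(self, elevator_ids: Iterable[str]):
        self.up_agenda: Dict[str, Set[int]] = {e: set() for e in elevator_ids}

    def request_up(self, floor: int) -> None:
        # Transaction of the simplest design: every elevator is asked to stop.
        for agenda in self.up_agenda.values():
            agenda.add(floor)

    def stop_going_up(self, elevator_id: str, floor: int) -> None:
        # Transaction of the second design: once one upward-moving elevator
        # has stopped at the floor, the floor is removed from every agenda.
        if floor in self.up_agenda[elevator_id]:
            for agenda in self.up_agenda.values():
                agenda.discard(floor)


bank = ElevatorBank(["A", "B", "C"])
bank.request_up(5)
pending_before = {e: sorted(a) for e, a in bank.up_agenda.items()}
bank.stop_going_up("B", 5)
pending_after = {e: sorted(a) for e, a in bank.up_agenda.items()}
```

Both methods only change stored state; neither computes anything that could not 
be read off the agendas directly. 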



3 Transactions Redefined 

In traditional data base applications a transaction is a data base update or a 
query evaluation. We shall both broaden this interpretation and make it narro- 
wer. The broadening allows “updates” to relate not just to a data base, but also 
to the environment in which a software system may be embedded. The narrowing 
excludes query processing. 

We regard the processing of a query as a typical procedural task. First, it does 
not bring about a state change of the system. Second, it obeys I/O semantics: 
the input is a data base and a predicate defining a result the user is interested 
in, the output is the result, and the computation consists of operations applied 
to the data base to produce the result. Moreover, at a sufficiently high level 
of abstraction, there are cases in which it is immaterial whether a query will 
refer to a data base or be evaluated by an algorithmic procedure. Suppose one is 
interested in the times of sunrise for Magdeburg in the year 2000. One approach 
would be to list the 366 entries in a table, and to look up the table. The other 
would be to compute the sunrises by means of an algorithm. 

The difference between updates and queries is fundamental. In updating, i.e., 
changing the state of a system, there must be a system to update, and it must 
be possible to describe states of the system. Thus, if I am adding sunrises for 
the year 2001 to my year-2000 system, I must point to a specific system that is 
to be augmented, and I must know how this system determines its responses. 
However, when a user puts a query to this system, there should be no need 
for the user to know whether the response is obtained by means of look-up or 
algorithm. Indeed, if I need the sunrise time for Magdeburg for September 18, 
2000, I do not necessarily have to go to my system. I could also try to get this 
information by means of Internet search. Of course, with existing query systems, 
such as SQL, a detailed knowledge of a data base is needed. 

Thus, query evaluation follows precisely the procedural model: input x (the 
query) is processed by device F (the query processing system), resulting in f(x) 
(the answer to the query). A control action seems to be conceptually similar: the 
system receives an input x from a sensor; a device F determines a response f(x), 
which is conveyed to an actuator. The point here is that a dynamic response, 
i.e., a change imposed on a controlled system, is to be made only if F detects 
a change in the controlled system. This means that F must be able to refer 
to earlier sensor readings, which means in turn that some kind of data base, 
however rudimentary, must be maintained, and that the data in this data base 
undergo changes. It should be kept in mind that differences between information 
systems and control systems are vanishing: in which category would one put 
programmed stock market trading by a mutual fund? 
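
The remark that a control device must refer to earlier sensor readings can be 
illustrated by a minimal Python sketch. The class ThresholdController, its 
tolerance parameter, and the particular response rule are hypothetical; they 
merely show that even the simplest controller keeps, and changes, stored data. 

```python
from typing import Optional


class ThresholdController:
    """Control device F that remembers the previous sensor reading."""

    def __init__(self, tolerance: float):
        self.tolerance = tolerance
        self.last_reading: Optional[float] = None  # rudimentary "data base"

    def on_reading(self, x: float) -> Optional[float]:
        """Return a correction for the actuator, or None if nothing changed."""
        previous = self.last_reading
        self.last_reading = x          # the stored data undergo a change
        if previous is None or abs(x - previous) <= self.tolerance:
            return None                # no detected change, hence no response
        return previous - x            # assumed rule: push back to the old value


controller = ThresholdController(tolerance=0.5)
responses = [controller.on_reading(x) for x in (20.0, 20.2, 22.0)]
```

Only the third reading differs from its predecessor by more than the tolerance, 
so only that reading produces a response for the actuator. 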

This effect suggests that the interpretation of the concept of transaction 
should be quite broad. A control system is embedded in a host environment, 
and its purpose is to change the state of this environment. Just as a transaction 
of an information system changes the state of a data base, so a transaction of 
a control system affects the state of the device or system of devices under its 
control, and a transaction executed by a programmed trading system affects the 
state of the portfolio held by the mutual fund. But there are differences. Most 
transactions of information systems are initiated by users, and most 
transactions of systems that control devices are initiated by the systems 
themselves in response to sensor inputs. Such differences will be discussed in 
greater detail in the next section. 

Note also that research on active data bases deals with the introduction of 
aspects of information systems into data bases, thus reducing the distinction 
between the two. There remain concerns that relate purely to data bases, such 
as query optimization and data base locking, and concerns that relate purely to 
information systems, such as enterprise analysis. However, the two fields have 
much in common, and an in-depth study of the two modes of computation as 
they relate to information systems should be a joint undertaking of data base 
and information systems researchers and developers. 



4 Specification of Transactions 

Let us now take a closer look at transactions. We consider a transaction to be 
an atomic operation that is initiated by a user, initiated by the system itself, or 
initiated by a user after being prompted by the system to do so. Let us call the 
three types user transactions, system transactions, and prompted transactions, 
respectively. The withdrawal of money from a bank account by a customer is 
a user transaction, but an automatic monthly salary deposit into the account, 
because it requires no intervention by people, is a system transaction. For an 
example of a prompted transaction we turn to the refereeing of conference papers. 
The system should have the knowledge that a paper needs referees. Hence, after 
the details of a submitted paper are entered into the conference data base, the 
system is able to prompt the chairperson of the program committee that referees 
are needed, but the actual referee selection is left to the chairperson. 

An important concern in defining specifications relates to constraints. In 
Section 2 we discussed the example of a library that stops further borrowing when 
money is owed. This can be expressed as a system invariant or as a precondition 
attached to the borrowing transaction. Our preference is for the latter because 
such preconditions are nearly always easier to write and to understand than 
system invariants. The format of the specification of a transaction is thus to allow 
for preconditions. It also has to allow for data base changes (“data conditions”), 
and, as we shall discuss in the next section, there has to be a mechanism for 
chaining transactions together into a process (“signals”). Note that there are 
two types of signals: N-signals are raised in the normal case, i.e., when the 
preconditions are true; E-signals are raised when preconditions are false. In the 
latter case the data conditions are ignored, and the signals invoke an exception 
handler. A schematic format for the specification of a transaction is thus: 

TRANSACTION Name(Inputs); 
  PRECONDITIONS Predicates; 
  DATACONDITIONS Predicates; 
  N-SIGNALS; 
  E-SIGNALS; 
ENDTRANSACTION; 

Let us return to the library example. The specification of the borrowing 
transaction under this format is 

TRANSACTION Borrowing(borr, book); 
  PRECONDITIONS HasOut(borr) < 6; Owes(borr) = 0; 
  DATACONDITIONS Borrower'(book) = borr; 
                 HasOut'(borr) = HasOut(borr) + 1; 
                 State'(book) = borrowed; 
                 ReturnDate'(book) = DateNow + BorrPeriod; 
  E-SIGNALS 
    HasOut(borr) ≥ 6 → (LimitExceeded)ON; 
    Owes(borr) > 0 → (MoneyOwed)ON; 
ENDTRANSACTION; 

For simplicity we assume all data to be in the form of sets or functions. This 
allows great flexibility in defining preconditions: a function value may come 
from a data base, it may be computed by means of an algorithm, or it can be 
obtained from an external Internet site. Here there are two preconditions. The 
function HasOut tells, for each borrower, how many books are currently out 
to this borrower. This number may not exceed 6. The function Owes returns, 
for each borrower, the amount of money the borrower owes to the library. The 
functions of the data conditions are necessarily mutable. Hence they have to be 
stored as entries in a data base, and data conditions define data base changes: 
the value of function Borrower for book after this transaction (indicated by a 
prime) will be borr; the value of function HasOut for borr after the transaction 
will be its value before the transaction increased by 1. The state of the book 
will be “borrowed”, and the book is to be returned a certain number of days 
after today’s date. Signals are used to send out messages that facilitate the 
combination of transactions into processes. Here we are dealing with an isolated 
transaction, so that there are no normal-case signals. 
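
To make the reading of the Borrowing specification concrete, the following 
Python sketch mimics it. This is not the SF implementation; it is a sketch 
under stated assumptions: functions are stored as dictionaries, the loan period 
of 21 days is invented, and the E-signal conditions are taken to be the 
negations of the two preconditions. The names Library and borrowing are 
hypothetical. 

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List

BORR_PERIOD = timedelta(days=21)   # assumed loan period, not given in the text


@dataclass
class Library:
    """Mutable functions of the specification, stored as dictionaries."""
    has_out: Dict[str, int] = field(default_factory=dict)
    owes: Dict[str, float] = field(default_factory=dict)
    borrower: Dict[str, str] = field(default_factory=dict)
    state: Dict[str, str] = field(default_factory=dict)
    return_date: Dict[str, date] = field(default_factory=dict)


def borrowing(lib: Library, borr: str, book: str, today: date) -> List[str]:
    """Apply the Borrowing transaction; return the list of raised E-signals."""
    e_signals = []
    if lib.has_out.get(borr, 0) >= 6:
        e_signals.append("LimitExceeded")
    if lib.owes.get(borr, 0.0) > 0:
        e_signals.append("MoneyOwed")
    if e_signals:
        return e_signals               # data conditions are ignored
    # Data conditions: primed values become the new stored values.
    lib.borrower[book] = borr
    lib.has_out[borr] = lib.has_out.get(borr, 0) + 1
    lib.state[book] = "borrowed"
    lib.return_date[book] = today + BORR_PERIOD
    return []


lib = Library(has_out={"ann": 2, "bob": 6}, owes={"ann": 0.0, "bob": 1.5})
ok = borrowing(lib, "ann", "b42", date(2000, 9, 18))
refused = borrowing(lib, "bob", "b43", date(2000, 9, 18))
```

In the sketch an exception handler would be attached to the returned E-signals; 
here they are simply handed back to the caller. 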



5 Transactions and Processes 

Nearly every computer program implements some process, but there is great 
variety in the interpretation of what is meant by a process. A fairly extensive 
search through software engineering textbooks revealed that most did not have 
process in the index. Where they did, process was interpreted as something that 
converts an input into an output, or as a bubble in a data flow diagram, or as 
something that results in a software system. In [7] we define a business process as 
an ordered collection of tasks that is to achieve a value-adding objective within a 
finite time interval. To allow for control processes as well, particularly continuous 
processes, this definition should be generalized: a process is an ordered collection 
of tasks that is to achieve some objective. This basic definition can be expanded 
and elaborated; for one such elaboration see p. 19 of [18]. 

A definition is important because organizations such as business enterprises 
are increasingly being defined in terms of processes rather than their 
organizational structures. Davenport [8], among others, considers this a basic 
characteristic of business reengineering. Moreover, Davenport argues 
convincingly that the management of the data on which a process is based should 
be the responsibility of the manager of the process. This seems to go against 
the philosophy of enterprise-wide data bases, but does not really do so. There 
can still be a central data base, but the transactions that effect changes in 
this data base are to be defined by the designers of the software systems 
supporting the individual business processes. 

A very simple instance of a process arises with the library example of Section 
4. For all borrowed books that have become overdue, i.e., with yesterday’s return 
date, the book is to be marked overdue, and a message is to be sent to the 
delinquent borrower. 

ACTION; 
  @(08:15:):: FORALL(b ∈ State⁻¹(borrowed)): 
    ReturnDate(b) < DateNow → 
      (MarkOverdue(b), REMIND(“Send out late notice”: b)); 
ENDACTION; 

TRANSACTION MarkOverdue(book); 
  DATACONDITIONS State'(book) = overdue; 
ENDTRANSACTION; 

At 8:15 every morning (the bare colon after the 15 tells that the clock to be 
used here is to have a resolution of one minute; 08:15:00 would require a clock 
with resolution of one second) every element of the set defined by the inverse 
of State with respect to “borrowed”, i.e., the set of all books that are currently 
borrowed, is to be examined. If the return date shows that the book is overdue, 
then a system transaction is invoked that is to change the state of the book to 
“overdue”, and a reminder is issued stating that a late notice regarding the book 
is to be sent out. Knowing b, a librarian can find out the name and address of 
the delinquent borrower, and the librarian is expected to send out the notice. 
This step could, of course, be easily automated. 
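
The division of labour between the action and the transaction MarkOverdue can 
be sketched in Python. The sketch assumes that some scheduler, not shown, calls 
overdue_action once a day at 08:15; the dictionary layout and the function names 
are hypothetical. Note that the action itself changes no data: the only state 
change happens inside mark_overdue. 

```python
from datetime import date
from typing import Dict, List, Tuple


def mark_overdue(state: Dict[str, str], book: str) -> None:
    """System transaction MarkOverdue: the only place where data change."""
    state[book] = "overdue"


def overdue_action(state: Dict[str, str],
                   return_date: Dict[str, date],
                   today: date) -> List[Tuple[str, str]]:
    """Body of the action; a scheduler is assumed to call it daily at 08:15."""
    reminders = []
    # Inverse of State with respect to "borrowed": all currently borrowed books.
    borrowed = [b for b, s in state.items() if s == "borrowed"]
    for b in borrowed:
        if return_date[b] < today:
            mark_overdue(state, b)
            reminders.append(("Send out late notice", b))
    return reminders


state = {"b1": "borrowed", "b2": "borrowed", "b3": "on_shelf"}
return_date = {"b1": date(2000, 9, 17), "b2": date(2000, 9, 30)}
reminders = overdue_action(state, return_date, date(2000, 9, 18))
```

Automating the late notice would amount to replacing the returned reminder by a 
further transaction or an outgoing message. 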

It is important to note that an action cannot define a data base change. 
Conversely, all time-related effects are confined to actions, and so is everything 
relating to task coordination. 

Our transaction-action model was first introduced in our specification 
language SF (for a brief introduction and earlier references see [7]), but there 
a transaction is called an event, and the actions of this paper are called 
transactions. Actions can be started by a signal, or by a calendar or clock, and 
the initiation of a transaction by an action may be delayed. It is also possible 
to perform periodic monitoring of a system, and to initiate a transaction when 
the system is found to be in a particular state. 

An entire process composed of our transactions and actions can be repre- 
sented by a Petri net (actually a slightly modified time Petri net — for time 
Petri nets see [4]). In this net a transaction is represented by a place, a signal 
by a token, and an action by a subnet composed of places and transitions. This 
provides actions with formal semantics. 

Space limitations prevent the presentation of the full SF syntax, but we are 
including the productions that relate to actions. Many formalisms for connecting 
transactions have been devised; two based on finite state machines are discussed 
in [10,2]. However, the partitioning of transactions into user, system, and 
prompted transactions, where system and prompted transactions are the 
responsibility of actions, is unique to our approach. So is the introduction of 
time constraints and delays, which can be expressed as time intervals rather 
than sharp values, with this feature having sound semantics based on time Petri 
nets. 

<Action>::=      ACTION [<ActionId>]; 
                 <Activator>:: [<ActPart>]* 
                 ENDACTION; 
<Activator>::=   <SigConstr> [<Delay>] o <TConstr> 
<SigConstr>::=   ON(<Signal> [,<Signal>]*)OFF 
<Signal>::=      <SigId> [(<Exp> [,<Exp>]*)] 
<Delay>::=       DELAY(<TimeExp> [,<TimeExp>]) 
<TConstr>::=     @(<TimeExp> [,<TimeExp>]) [: <SigConstr>] 
<ActPart>::=     [<BoolExp>: o <Delay>: o <SigConstr>: o <Iterator>:]* 
                 [<PrimAct> o (<PrimAct> [,<PrimAct>]*)]+; 
<Iterator>::=    FORALL(<Id> ∈ <SetExp>) 
<PrimAct>::=     <TransInvoc> o <Prompt> o <Reminder> 
<TransInvoc>::=  <TransId> [(<Exp> [,<Exp>]*)] 
<Prompt>::=      PROMPT(<TransId> [(<Exp> [,<Exp>]*)]) 
<Reminder>::=    REMIND(“<Textexpr>” [: <Exp> [,<Exp>]*]) 



Notation: Square brackets indicate that the item enclosed in the brackets is 
optional. If square brackets are followed by the symbol *, then the enclosed item 
may be present zero or more times. The symbol o indicates alternation, e.g., 
production A ::= B o C indicates that A may be rewritten as B or as C, and 
[B o C]* stands for any number of Bs and any number of Cs, written in any 
order. All syntactic categories whose names end with Id (for Identifier) or Exp 
(for Expression) are left undefined. The vocabulary consists of terminals denoted 
by these syntactic categories, those symbols in the productions that consist of 
capital letters alone (e.g., ACTION, OFF), and the nine symbols in the braces: 
{ ( ) , : ; ∈ @ “ ” }. Time expressions are discussed in detail in [5]. 

We explain the structure of actions by means of an example. In order to bring 
in nearly all the features of the language, the example is much more complex 
than any action one will ever need to define in practice. 

ACTION Sample; 
  @(8:00:, 9:00:): ON(SigA, SigB(Switch, SetX))OFF:: 
    Switch: DELAY(*15*, *30*): ON(SigA)OFF: (TransA, TransB); 
    FORALL(x ∈ SetX): PROMPT(TransC(x, _)); 
    REMIND(“Send out papers to reviewers”); 
    DELAY(4*0*): REMIND(“Have papers been sent to reviewers?”); 
ENDACTION; 

Action Sample is initiated by a clock (with resolution of one minute) between 
8:00 and 9:00 in the morning. If at this time signals SigA and SigB are both 
on, they are switched off, and four ActParts are considered. If the signals are 
not both on, nothing further happens, and, if one of the signals is on, it remains 
in this state. If the boolean Switch, which has been brought in by SigB, is 
true, then, after a delay of between 15 and 30 minutes, but only if at that time 
SigA has again been raised, SigA is switched off and transactions TransA and 
TransB are initiated. 

The FORALL on the next line is an iterator over all elements of the set 
SetX. Here, for each such element, transaction TransC is to be initiated. One 
of the arguments of this transaction is to be supplied by a user — this argument 
is represented by an underscore. Note that SetX has been brought in by SigB. 
The next two lines define reminders, one to be issued immediately, and the other 
after a delay of four hours. 
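
The behaviour of the signal constraint ON(...)OFF described for action Sample 
can be sketched in Python as a test-and-clear operation. The function name 
on_off and the representation of signals as a dictionary of booleans are 
assumptions; time windows, delays, and the values carried by signals are 
omitted from the sketch. 

```python
from typing import Dict, Iterable


def on_off(signals: Dict[str, bool], names: Iterable[str]) -> bool:
    """Signal constraint ON(...)OFF as described for action Sample.

    If every named signal is on, switch all of them off and return True.
    Otherwise leave the signals untouched and return False.
    """
    names = list(names)
    if all(signals.get(n, False) for n in names):
        for n in names:
            signals[n] = False
        return True
    return False


signals = {"SigA": True, "SigB": False}
first = on_off(signals, ["SigA", "SigB"])    # SigB is off: nothing happens
after_first = dict(signals)
signals["SigB"] = True
second = on_off(signals, ["SigA", "SigB"])   # both on: both are switched off
```

In the first call SigA stays on because SigB is off, which corresponds to the 
statement that a signal that is on "remains in this state". 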

The components of an action are interpreted in terms of time Petri nets 
[5]. Then an entire action of arbitrary complexity can be represented by a time 
Petri net, as we show in [5]. Since UML activity diagrams are just streamlined 
basic Petri nets, SF process specifications can be represented by UML activity 
diagrams, but with time-related annotations added. Also, since Petri nets are 
considered well suited for specification of workflows (see p. 119 of [18], and 
p. 368 of [29]), SF can be used as a workflow specification language, which we 
already noted in [7]. 

6 Discussion Topics 

By identifying transactional and procedural computations as the two principal 
modes of computing we have tried to give a basis for the examination of some 
recent trends in information systems and data base research. For example, ac- 
tive data bases can be regarded as transactional systems in which the triggers 
correspond to our actions. However, there are numerous unresolved questions re- 
garding transactional computation. We introduce some of them in what follows. 
These questions were the basis for a discussion session. Following each topic, 
comments by participants are given in full or in summary. Particularly thought- 
ful comments were submitted after the workshop by E.O. De Brock, G. Saake, 
H. Schuldt, and K.-D. Schewe. 

1. What is the best way of combining specifications of transactions and 
procedures? How suitable are our actions for this purpose? 

Comments: (a) For each system specification it is interesting to know what 
are the needs of the application area in terms of expressiveness. Then a certain 
language may be chosen according to individual preferences and 
understandability. In general, I do not believe that there is one overall “best” 
solution for all application scenarios. (b) This question can only be answered 
after making precise the difference between transactions and procedures. I 
prefer to speak about atomic actions (comparable to a method invocation or an 
SQL update) and transactions composed of these actions as units of integrity 
preservation. (c) It depends on the extent to which one wants to combine 
transactions and procedures. When a procedure forms the basis that is enhanced 
so as to make it transactional, then it is ok. (ATB: This comment is based on a 
view of procedures and transactions that differs somewhat from that advanced 
above, where we consider transactions to be more primitive than procedures. Of 
course, sooner or later, transactions have to be implemented, and then they 
become procedural.) 

2. Should the specification of transactions be explicit or implicit? Implicit
specification means that not every transaction that will be needed is specified.
Instead, a set of predicates defines permissible system states and state
transitions. When, for example, a transaction puts the system into an
impermissible state, the system is to be restored to a permissible state, but
how this is done is left unspecified.

Comments: (a) In the paper by de Brock in this volume the set of permissible
system states is simply captured by a database universe U and the set of state
transitions by a transition relation R on U. Under this approach it can be clearly
and neatly specified what should happen in case a transaction leads into an
impermissible state (see the discussion following Theorem 1 of that paper). (b)
Note that in traditional database systems in general you have to specify only
your transaction; recovery is performed by the database management system
transparently. (c) It is well known that there is no algorithmic solution for the
transformation of an implicit specification with lots of static and dynamic
constraints into an (executable) explicit specification (see, e.g., [1]). This
implies that whenever an implicit specification appears to be suitable in terms
of an intended application, there is a need for a pragmatically driven
refinement process. I would prefer to provide application-specific refinement
calculi for that purpose, consisting of certain predefined and a priori correct
refinement rules; such a system of rules for relational reification can be found
in [19]. On the related issue of consistency enforcement, we have defined
greatest consistent specializations and variants thereof. This work includes a
characterization of desirable properties in consistency enforcement and contains
results on commutativity and compositionality [22,21]. On the other hand it
shows the non-computability of any reasonable approach and the non-suitability
of triggers. (d) If there are many types of transactions, then invariants may be
a better solution. (e) Complete specification of all possible transactions is
only possible in very restricted application areas, and is feasible only for
safety-critical systems. However, in some scenarios it may be feasible to specify
a set of a few generic transaction patterns that are to guarantee integrity
instead of all transactions. A sound mix of those alternatives will enhance the
reliability and performance of information systems even if it is not a 100
percent solution. Implicitly specified transactions require expensive monitoring
of constraints.
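
The implicit style discussed under topic 2 can be illustrated with a small
sketch. It is only an illustration under stated assumptions, not a method taken
from any of the papers: the state is a dictionary of account balances, the
predicates in_universe and transition_allowed are hypothetical stand-ins for a
database universe U and a transition relation R, and the restoration policy
(discard the changes and keep the prior state) is just one possible choice,
since the question deliberately leaves restoration unspecified.

```python
from copy import deepcopy
from typing import Callable, Dict

State = Dict[str, int]  # assumed state model: account name -> balance


def in_universe(state: State) -> bool:
    """Hypothetical state constraint: no balance is negative."""
    return all(balance >= 0 for balance in state.values())


def transition_allowed(before: State, after: State) -> bool:
    """Hypothetical transition constraint: the total amount is conserved."""
    return sum(before.values()) == sum(after.values())


def run_implicit(state: State, transaction: Callable[[State], None]) -> State:
    """Apply a transaction to a copy and keep the result only if permissible."""
    candidate = deepcopy(state)
    transaction(candidate)
    if in_universe(candidate) and transition_allowed(state, candidate):
        return candidate
    # Assumed restoration policy: fall back to the unchanged prior state.
    return state


def transfer(src: str, dst: str, amount: int) -> Callable[[State], None]:
    """Build an ad-hoc transaction; it carries no precondition of its own."""
    def apply(state: State) -> None:
        state[src] -= amount
        state[dst] += amount
    return apply


start = {"a": 100, "b": 0}
accepted = run_implicit(start, transfer("a", "b", 30))
rejected = run_implicit(start, transfer("a", "b", 500))
```

Note that every transaction, however simple, pays for a full evaluation of both
predicates, which is the monitoring cost mentioned in comment (e).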

3. System transactions are candidates for implicit specification, but, since 
prompted and user specifications relate to the user interface, is it not essential 
that they be specified explicitly? 

Comments: (a) Yes, it is our experience that it is essential that they be
specified explicitly: a lot of misunderstandings will otherwise be left
unnoticed. (b) In most cases, user transactions have to be specified explicitly
(e.g. by some application program linked to a database). But what if some user
sitting on a SQL GUI types in his/her update, insert, etc. statements in an
ad-hoc way? This suggests that it is not necessary to have a complete
specification of a transaction in all cases. (c) It is my true belief that only
a holistic view of an information system will lead to well designed information
systems. This means that all aspects of a system are to be specified at the same
time, taking care of interdependencies. In particular, the user interface is not
something to be added on later or to be derived automatically. A strategy which
emphasizes “do first this, then that” is not adequate. (ATB: Comment (c)
emphasizes that the traditional separation between specification, the “what”,
and design, the “how”, may not be productive.)

4. A dialogue system is transactional, but it consists of a very large number 
of very simple transactions. What is the best way of specifying such a system? Is 
a separate specification step even needed in the development of such a system? 

Comments: (a) The best way to develop dialogue systems is to use an
integrated approach, as covered in, for example, our work on dialogue objects
[20]. (b) A straightforward way is to specify all transactions independently (in
fact, they are independent; otherwise there would be no separation) and have a
(database) system enforce correctness. To this end, allowed system states (or,
more easily, forbidden system states) have to be specified and have to be
enforced (= invariants). (c) For large systems we need a modularized approach,
such as proposed in object-oriented specification languages for information
systems (see, for example, the language Troll [12]). This approach allows the
separate specification of small composable parts of an information system,
called objects, which can be handled as small reactive and communicating systems.

5. Generalizing the implicitness-explicitness issue, are there “best” models 
and notations, or is the acceptance of a particular model or notation determined 
by the personality of a user? This may depend on deep-seated traits such as
left-brain or right-brain dominance. It has long been recognized that users have 
different attitudes to the form of a query language (see, e.g., [28]). Is one form 
really better than another, or is it merely more appropriate for a particular user? 

Comments: (a) What is relevant in specification is expressiveness and
semantics. Which specific language (or graphical formalism, which is also a
language) is used may then become a matter of taste and general conventions
within a company or development project. If effective transformations are at
hand, different formalisms may be combined. Usage is a matter of pragmatics. (b)
There is no “best” model well-suited for all application scenarios.

6. Is it necessary to take special steps to assist communication across
“personality gaps”? In other words, should we develop tool support for a
transformation between textual (e.g., SF) and graphical (e.g., UML)
representations at the specification level?

Comments: (a) UML is nothing, because it is just syntax and pictures without
clearly defined semantics. The same applies to OMT, the Booch method and the
other predecessors of UML. Usually, system design is organized within projects,
and, with respect to a project, representation means are fixed. Then it is
important to emphasize the perspective of the user, i.e., to ask how people will
work with a system. (b) I agree that such special steps are needed.

7. Does explicit specification of all transactions necessarily mean that system
integrity is to be maintained by preconditions? The invariants that specify
permissible states and state transitions under implicitness constitute an
equivalent mode of integrity control. However, one of the reasons for giving
control over process data to the manager of the process is that this allows some
preconditions of some transactions to be sometimes ignored at the discretion of
the manager of the process. This raises two questions. First, how are such
exceptions to be reconciled with corresponding invariants, should such invariants
exist? Second, should we introduce a notation for preconditions that explicitly
allows overrides for certain transactions, and, in case an override is to be
effective for just a limited time interval, indicates the length of this time
interval?

Comments: (a) My answer to the first question is No, and I agree with the
next sentence (see again the discussion following Theorem 1 in the paper by
de Brock in this volume). My answer to the rest of the discussion is that there
should be two transactions in such cases, one for managers and one for ordinary
employees, or that the specification of the transaction contains a case analysis
(for managers and non-managers). This avoids a lot of problems, including the
ones mentioned. (b) In general, you may want to consider static, transition and
dynamic constraints. In the first two cases there exist well known proof
obligations for consistency. Then the task is either to verify these conditions
or to use consistency enforcement (see comments on 2). Exception handling is
related to overspecification: (transition) constraints are too strong. In most
cases it is preferable to leave a certain decision latitude to the user and to
specify not what must be done, but instead, what is not allowed. In some sense
this is exactly the spirit of Wegner’s co-inductive view on interaction. It is
my true belief that this approach is best captured by dialogue objects (see
comments on 4). The issue of “weak constraints”, possibly to be specified in
deontic logic, is still a challenging research issue. (c) In my opinion,
preconditions are too weak. When transactions have to transfer one allowed
system state into another one, some kind of invariant has to exist that may be
temporarily violated within one transaction but that has to be re-established at
least at the end of a transaction. (d) No. That is just one approach to do the
job. Other approaches are monitoring, query and update rewriting, active rules,
etc.
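
Comment (c) on topic 7 describes an invariant that may be violated inside a
transaction but must hold again at its end. A minimal sketch of such deferred
checking is given below. It rests on assumptions not found in the discussion:
the state is a plain dictionary, the names transaction, InvariantViolation and
total_is_100 are hypothetical, and rollback is done by restoring a snapshot
taken at the start.

```python
from contextlib import contextmanager
from copy import deepcopy


class InvariantViolation(Exception):
    """Raised when the invariant does not hold at the end of a transaction."""


@contextmanager
def transaction(state, invariant):
    """Check the invariant only at commit; restore the snapshot on failure."""
    snapshot = deepcopy(state)
    try:
        yield state
        if not invariant(state):
            raise InvariantViolation("invariant not re-established at commit")
    except Exception:
        # Roll back in place so that all holders of `state` see the old values.
        state.clear()
        state.update(snapshot)
        raise


def total_is_100(s):
    """Hypothetical invariant: the balances always add up to 100."""
    return sum(s.values()) == 100


accounts = {"a": 50, "b": 50}

# The invariant is violated after the first update and holds again at the end.
with transaction(accounts, total_is_100) as s:
    s["a"] -= 20
    s["b"] += 20

# This transaction ends in a violating state and is therefore rolled back.
try:
    with transaction(accounts, total_is_100) as s:
        s["a"] -= 20
    rolled_back = False
except InvariantViolation:
    rolled_back = True
```

A precondition on each single update would have rejected the first step of the
successful transfer, which is one way to read the remark that the two modes of
integrity control differ in where the check is placed.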

8. How are we to assist unsophisticated users in the definition of
preconditions? Although preconditions are in general easier to write than
“global” invariants, even people with a good understanding of formal logic are
known to make errors in writing logical expressions. In [6] we consider a visual
approach to defining the answer to a query, and we indicate that, in parallel
with the construction of this answer, the system could generate a predicate.
Could this approach be used to construct preconditions? A broader question
relates to our ability to convert global invariants into preconditions, and vice
versa.

Comments: (a) My answer to the first question is that we (as analysts) have 
to do it and not, e.g., the librarians in your example. My reaction to the second
sentence is that it turns out to be a good check to translate your resulting
logical expression back again to the natural language of the user. (b) In fact,
communication with users is unavoidable and specifications have to be explained
(see comments on 6). (c) Unsophisticated users should deliver examples and 
counter-examples that are to be transformed into logical expressions by analysts. 
Then we can start an iterative process of refinement/validation. The first step 
can even be supported by learning or data mining tools. 

9. Is there a difference between active data bases and information systems,
and, if so, what is it? What, if any, difference is there between information
systems and control systems? Medina-Mora et al. [15] distinguish between
material, information, and business process perspectives. To what extent can
this classification assist in the specification of transactions?

Comments: (a) I see information systems as covering a much broader ground
than active databases. The latter are just one way of implementing an
information system. (b) There is a significant difference between ADBs and ISs.
The latter do not just comprise databases, but also application programs,
constraints, user interfaces, etc. What is important is the purpose and the
usage of a system including ergonomic aspects, preservation of user skills, etc.
When we are talking about information systems we also mean their environment,
which brings us to the consideration of aspects that are not computable, e.g.
the human work with an IS. ADBs, on the other hand, merely capture a very limited
amount of system dynamics, represented in the form of rules. There are
significant limitations even w.r.t. problems that are being claimed as major
applications, such as consistency enforcement. (c) There is no user interaction
with ADBs; the qualifier “active” is therefore somewhat misleading.

10. Do existing workflow management systems provide sufficiently effective
mechanisms for the sequencing, coordination, and synchronization of
transactions?

Comments: (a) Commercial workflow management systems definitely do not!
They support some limited kind of failure handling and persistence, but, to the
best of my knowledge, concurrency is not properly addressed by any of the
commercial products. Within the WISE project of ETH at Zurich, we developed a
process management system that is aware of concurrency, thus offering not only
the persistent storage of process states and sophisticated failure handling
mechanisms, but also the correct scheduling of processes accessing shared
resources. (b) Workflow systems ignore the differences pointed out under 9.
Human work, especially how it differs from computable processes, is ignored.
Wegner [27] shows the difference between interaction and computation. Stated
differently, workflow systems lead to a completely wrong view of information
systems.

11. How relevant is speech/act and language/act research to the specification 
of transactions? It appears that speech acts have a very similar structure to our 
transaction/action patterns (see, e.g., [3]).

12. What exactly is a transaction? 

Comments: (a) In my opinion a transaction on a state space U is simply a
function from U into U (see Definition 1 in the paper by de Brock in this
volume). (b) In my traditional thinking, transactions are atomic and isolated
state changes (encompassing a set of basic operations), bringing a system from
one consistent state to another (not necessarily different) one. The actions as
described in the paper are in my opinion much more: they correspond to ECA rules
of active databases. Therefore, they are not only actions but do also indicate
when these actions are to be executed. (c) The standard use of the notion
“transaction” is an execution of a database program reduced to database-relevant
operations. In the paper the term is used more or less in the sense of an atomic
database program. (d) There does not appear to be a clear understanding of
how a transaction differs from an event. (e) A transaction is any operation that
depends on a system state, which would include query processing. Procedures
are state-independent. (f) A transaction is an explicitly specified
transformation of the database state that preserves integrity.
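
Definitions (a) and (f) under topic 12 can be made concrete for a finite state
space. The sketch below is an illustration only: the universe of three library
states, the operations borrow and give_back, and the helpers compose, guarded
and preserves are all hypothetical names introduced here, and checking a
property by enumerating U is feasible only because U is tiny.

```python
from typing import Callable, FrozenSet, Set

State = FrozenSet[str]                  # assumed model: set of borrowed books
Transaction = Callable[[State], State]  # definition (a): a function on states


def borrow(book: str) -> Transaction:
    return lambda s: s | {book}


def give_back(book: str) -> Transaction:
    return lambda s: s - {book}


def compose(*steps: Transaction) -> Transaction:
    """Run several operations as one transaction, left to right."""
    def run(s: State) -> State:
        for step in steps:
            s = step(s)
        return s
    return run


def guarded(t: Transaction, universe: Set[State]) -> Transaction:
    """Turn t into a function from U into U: leave the state unchanged
    whenever t would lead outside the universe."""
    return lambda s: t(s) if t(s) in universe else s


def preserves(t: Transaction, universe: Set[State]) -> bool:
    """Definition (f), checked by enumeration: t maps U into U."""
    return all(t(s) in universe for s in universe)


# Hypothetical universe U: at most one book may be on loan at any time.
U = {frozenset(), frozenset({"b1"}), frozenset({"b2"})}

plain_ok = preserves(borrow("b1"), U)
guarded_ok = preserves(guarded(borrow("b1"), U), U)
swap_ok = preserves(compose(borrow("b1"), give_back("b1")), U)
```

Under these assumptions the unguarded borrow is not a transaction in the sense
of (a) or (f), because it can leave U, whereas the guarded and the composite
versions are, even though the composite passes through a state outside U.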

References 

1. S. Abiteboul and V. Vianu. A Transaction-based Approach to Relational Database 
Specification. Journal of the ACM, 36:758-789, 1989. 

2. B. Babin, F. Lustman, and P. Shoval. Specification and Design of Transactions 
in Information Systems: A Formal Approach. ACM Transactions on Information 
Systems, 17:814-829, 1991. 

3. M. D. Beer, T. J. M. Bench-Capon, and A. Sixsmith. Dialogue Management in
a Virtual College. In Database and Expert System Applications, Proc. of the 10th
Int. Conf., DEXA’99, pages 521-530, Lecture Notes in Computer Science, Vol.
1677, Springer-Verlag, 1999.

4. B. Berthomieu and M. Diaz. Modeling and Verification of Time Dependent Systems
Using Time Petri Nets. IEEE Transactions on Software Engineering, 17:259-273,
1991.

5. A. T. Berztiss. A Linkage Mechanism for Transactions. Available from the author. 

6. A. T. Berztiss. The Query Language Vizla. IEEE Transactions on Knowledge and 
Data Engineering, 5:813-825, 1993. 

7. A. T. Berztiss. Software Methods for Business Reengineering. Springer-Verlag,
1995.

8. T. H. Davenport. Process Innovation: Reengineering Work through Information 
Technology. Harvard Business School Press, 1993. 

9. B. de Brock. Foundations of Semantic Databases. Prentice Hall, 1995. 

10. D. Harel. Statecharts: a Visual Formalism for Complex Systems. Science of
Computer Programming, 8:231-274, 1987.

11. M. Jackson and G. Twaddle. Business Process Implementation: Building Workflow
Systems. Addison-Wesley, 1997.

12. R. Jungclaus, G. Saake, T. Hartmann, and C. Sernadas. Troll - A Language
for Object-Oriented Specification of Information Systems. ACM Transactions on
Information Systems, 14(2):175-211, April 1996.

13. F. Kröger. Temporal Logic of Programs. Springer-Verlag, 1987.

14. R. Kurki-Suonio and R. Mikkonen. Harnessing the Power of Interaction. In
H. Jaakkola, H. Kangassalo, and H. Kawaguchi, editors, Information Modelling and
Knowledge Bases X, pages 1-11, IOS Press, 1999.







15. P. Medina-Mora, T. Winograd, R. Flores, and C. F. Flores. The Action Workflow
Approach to Workflow Management Technology. In J. Turner and R. Kraut,
editors, Proc. 4th Conf. Computer-Supported Cooperative Work, pages 281-288, ACM
Press, 1992.

16. T. A. Mueck. Active Databases: Concepts and Design Support. In Advances in 
Computers, pages 107-189, Vol. 39, Academic Press, 1994. 

17. G. Saake. Descriptive Specification of Database Object Behaviour. Data &
Knowledge Engineering, 6(1):47-74, 1991.

18. T. Schael. Workflow Management Systems for Process Organization, Lecture Notes
in Computer Science, Vol. 1096. Springer-Verlag, 2nd edition, 1998.

19. K.-D. Schewe. Specification and Development of Correct Relational Database
Programs. Technical report, TU Clausthal, 1995.

20. K.-D. Schewe and B. Schewe. Integrating Database and Dialogue Design. To 
appear in Knowledge & Information Systems. 

21. K.-D. Schewe and B. Thalheim. Limits of Rule Triggering Systems for Integrity
Maintenance in the Context of Transition Specifications. Acta Cybernetica,
13:277-304, 1998.

22. K.-D. Schewe and B. Thalheim. Towards a Theory of Consistency Enforcement. 
Acta Informatica, 36:97-141, 1999. 

23. A. P. Sistla and O. Wolfson. Temporal Triggers in Active Databases. IEEE
Transactions on Knowledge and Data Engineering, 7:471-486, 1995.

24. M. Stonebraker, J. Anton, and E. Hanson. Extending a Database System with
Procedures. ACM Transactions on Database Systems, 12(3):350-376, September
1987.

25. M. Thorin. Real-time Transaction Processing. Macmillan, 1992.

26. W. M. Turski. Time Considered Irrelevant for Real-time Systems. BIT, 28:473-486,
1988.

27. P. Wegner. Why Interaction is more Powerful than Algorithms. Communications 
of the ACM, 40(5):80-91, May 1997. 

28. C. Welty and D. W. Stemple. Human Factors Comparison of a Procedural and a
Nonprocedural Query Language. ACM Transactions on Database Systems,
6:626-649, 1981.

29. M. Weske and G. Vossen. Workflow Languages. In P. Bernus, K. Mertins, and
G. Schmidt, editors, Handbook on Architectures of Information Systems, pages
359-379, Springer-Verlag, 1998.




Author Index 



Aoumeur, N., 91 

Bertino, E., 67 
Bertossi, L., 112 
Berztiss, A. T., 231 

De Brock, B., 150 
Doucet, A., 130 

Fent, A., 45
Freitag, B., 45 

Gançarski, S., 130
Guerrini, G., 67 

Karoui, R., 167 
Kindler, E., 26 

Leon, G., 130 



Montesi, D., 67 

Pinto, J., 112 
Popovici, A., 193, 225 

Rukoz, M., 130 

Saheb, M., 167 
Schek, H.-J., 193 
Schenkel, R., 1 
Schuldt, H., 193, 225 
Sedillot, S., 167 

Veijalainen, J., 203 

Weikum, G., 1 
Weißenberg, N., 1
Wichert, C.-A., 45
Wu, X., 1 




